ABSTRACT

This report addresses an opportunity to accelerate 
progress in virtually every branch of science and engineering 
concurrently, while also boosting the American economy as business 
firms also learn to exploit these new capabilities. The successful 
rapid advancement in both science and technology creates its own 
challenges, four of which are outlined here for the National Science 
Board. Four sets of interdependent recommendations are made in 
response to the challenges. The first implements a balanced pyramid 
of computing environments. Each element in the pyramid supports the 
others; whatever resources are applied to the whole, the balance in 
the pyramid should be sustained. The second set addresses the 
essential research investments and other steps to remove the 
obstacles to realizing the technologies in the pyramid and the 
barriers to the effective use of these environments. The third set 
addresses the institutional structure for delivery of the HPC 
capabilities, and consists itself of a pyramid. At the base of the 
institutional pyramid is the diverse array of investigators in their 
universities and other settings, who use all the facilities at all 
levels of the pyramid, followed by departments and research groups 
devoted to specific areas of computer science and engineering, and 
the National Science Foundation (NSF) high performance computing 
(HPC) Centers. At the apex is the national teraflop-class facility,
which is recommended as a multi-agency facility pushing the frontiers 
of high performance into the next decade. A final recommendation 
addresses the NSF role at the national level and its relationship 
with the states in HPC. Concepts are illustrated with two figures and 
two tables. Appendices include: a list of the membership of the Blue 
Ribbon Panel on High Performance Computing; information on the
history and origin of this study on the NSF and HPC; a discussion of 
technology trends and barriers to further progress; four figures 
illustrating supercomputer data; and a review and prospectus of 
computational and computer science and engineering with personal 
statements by panel members. (MAS) 
a builder of important academic institutions, and a devoted and effective public servant;
Professor Nico Habermann. Dr. Habermann took responsibility in organizing this Panel 
work and saw it through to completion, but passed away just a few days before it 
was presented to the National Science Board. The members of the panel deeply 
feel the loss of his creativity, wisdom, and friendship. 
EXECUTIVE SUMMARY



An Introductory Remark: Many reports are 
prepared for the National Science Board and the 
National Science Foundation that make an elo- 
quent case for more resources for one discipline or 
another. This is not such a report. This report ad- 
dresses an opportunity to accelerate progress in 
virtually every branch of science and engineering 
concurrently, while also giving a shot in the arm to 
the entire American economy as business firms 
also learn to exploit these new capabilities. The 
way much of science and engineering are prac- 
ticed will be transformed, if our recommendations 
are implemented. 

The National Science Board can take pride in the 
Foundation's accomplishments in the decade since 
it implemented the recommendations of the Peter 
Lax Report on high performance computing 
(HPC). The Foundation's High Performance Com- 
puting Centers continue to play a central role in 
this successful strategy, creating an enthusiastic 
and demanding set of sophisticated users, who 
have acquired the specialized computational skills 
required to use the fast advancing but still imma- 
ture high performance computing technology. 

Stimulated by this growing user community, the 
HPC industry finds itself in a state of excitement 
and transition. The very success of the NSF pro- 
gram, together with those of sister agencies, has 
given rise to a growing variety of new experimen- 
tal computing environments, from massively paral- 
lel systems to networks of coupled workstations, 
that could, with the right research investments, 
produce entirely new levels of computing power, 
economy, and usability. The U.S. enjoys a sub- 
stantial lead in computational science and in the 
emerging technology; it is urgent that the NSF 
capitalize on this lead, which not only offers scien- 
tific preeminence but also the industrial lead in a 
growing world market. 

The vision of the rapid advances in both science 
and technology that the new generation of super- 
computers could make possible has been shown to 
be realistic. This very success, measured in terms 
of new discoveries, the thousands of researchers
and engineers who have gained experience in
HPC, and the extraordinary technical progress in 
realizing new computing environments, creates its 
own challenges. We invite the Board to consider 
four such challenges: 

Challenge 1: How can NSF, as the nation's 
premier agency funding basic research, remove ex- 
isting barriers to the rapid evolution of high perfor- 
mance computing, making it truly usable by all 
the nation's scientists and engineers? These bar- 
riers are of two kinds: technological barriers 
(primarily to realizing the promise of highly paral- 
lel machines, workstations, and networks) and im- 
plementation barriers (new mathematical methods 
and new ways to formulate science and engineer- 
ing problems for efficient and effective computa- 
tion). An aggressive commitment by NSF to 
leadership in research and prototype development, 
in both computer science and in computational 
science, will be required. 

Challenge 2: How can NSF provide scalable ac- 
cess to a pyramid of computing resources, from 
the high performance workstations needed by 
most scientists to the critically needed teraflop- 
and-beyond capability required for solving Grand 
Challenge problems? What balance among
high performance desktop workstations, mid-
range or mini-supercomputers, networks of
workstations, and remote, shared supercomputers
of very high performance should NSF anticipate
and encourage?

Challenge 3: The third challenge is to encourage 
the continued broadening of the base of participa- 
tion in HPC, both in terms of institutions and in 
terms of skill levels and disciplines. This calls for 
expanded education and training, and participation 
by state-based and other HPC institutions. 

Challenge 4: How can NSF best create the intel- 
lectual and management leadership for the future 
of high performance computing in the U.S.? What 
role should NSF play within the scope of the na- 
tionally coordinated HPCC program? What 
relationships should NSF's activities in HPC have
to the activities of other federal agencies? 

This report recommends significant expansion in 
NSF investments, both in accelerating progress in 
high performance computing through computer 
and computational science research and in provid- 
ing the balanced pyramid of computing facilities 
to the science and engineering communities. The 
cost estimates are only approximate, but in total 
they do not exceed the Administration's stated in- 
tent to double the investments in HPCC during the 
next 5 years. We believe these investments are not 
only justified but are compatible with stated na- 
tional plans, both in absolute amount and in their 
distribution. 

RECOMMENDATIONS: 

We have four sets of interdependent recommenda- 
tions. The first implements a balanced pyramid of 
computing environments (see Figure A following 
this Summary). Each element in the pyramid sup- 
ports the others; whatever resources are applied to 
the whole, the balance in the pyramid should be 
sustained. The second set addresses the essential 
research investments and other steps to remove the 
obstacles to realizing the technologies in the 
pyramid and the barriers to the effective use of 
these environments. 

The third set addresses the institutional structure 
for delivery of HPC capabilities, and consists it- 
self of a pyramid (see Figure B following this 
Summary), of which the NSF Centers are an im- 
portant part. At the base of the institutional 
pyramid is the diverse array of investigators in 
their universities and other settings, who use all 
the facilities at all levels of the pyramid. At the 
next level are departments and research groups 
devoted to specific areas of computer science or 
computational science and engineering. At the 
next level are the NSF HPC Centers, which must 
continue to be providers of shared high capability 
computing systems and to provide aggregations of 
specialized capability for all aspects of use and ad- 
vance of high performance computing. At the apex 
is the national teraflop-class facility, which we
recommend as a multi-agency facility pushing the
frontiers of high performance into the next decade. 

A final recommendation addresses the NSF's role 
at the national level and its relationship with the 
states in HPC. 

A. CENTRAL GOAL FOR 
NSF HPC POLICY 

Recommendation A-1: The National Science
Board should take the lead, under OSTP guidance 
and in collaboration with ARPA, DoE and other 
agencies, to expand access to all levels of the 
dynamically evolving pyramid of high perfor- 
mance computing capability for all sectors of the 
whole nation. The realization of this pyramid 
depends, of course, on rapid progress in the 
pyramid's technologies. The computational 
capability we envision includes not only the re- 
search capability for which NSF has special 
stewardship, but also includes a rapid expansion 
of capability in business and industry to use HPC 
profitably, and many operational uses of HPC in 
commercial and military activities. 

VISION OF THE HPC PYRAMID 

Recommendation A-2: At the apex of the 
pyramid is the need for a national capability at the 
highest level of computing power the industry can 
support with both efficient software and hardware. 
A reasonable goal would be the design, develop- 
ment, and realization of a national teraflop-class 
capability, subject to the successful development 
of software and computational tools for such a 
large machine (recommendation B-1). NSF
should initiate, through OSTP, an interagency plan 
to make this investment, anticipating multi-agency 
funding and usage. 

Recommendation A-3: Over a period of 5 years 
the research universities should be assisted to ac- 
quire mid-range machines. These mid-sized 
machines are the underfunded element of the 
pyramid today — about 10% of NSF's FY92 HPC 
budget is devoted to their acquisition. They are 
needed for both demanding science and engineer-
ing problems that do not require the very maxi-
mum in computing capacity, and for use by the
computer science and computational mathematics 
community in addressing the architectural, 
software, and algorithmic issues that are the 
primary barriers to progress with massively paral- 
lel processor architectures. 

Recommendation A-4: We recommend that NSF 
double the current annual level of investment ($22 
million) providing scientific and engineering 
workstations to its 20,000 principal investigators. 
Within 4 or 5 years workstations delivering up to 
400 megaflops costing no more than $15,000 to 
$20,000 should be widely available. For education 
and a large fraction of the computational needs of 
science and engineering, these facilities will be 
adequate. 

Recommendation A-5: We recommend that the 
NSF expand its New Technologies program to sup- 
port expanded testing of the new parallel con- 
figurations for HPC applications. For example, 
the use of Gigabit local area networks to link 
workstations may meet a significant segment of 
mid-range HPC science and engineering applica- 
tions. A significant supplement to HPC applica- 
tions research capacity can be had with minimal 
additional cost if such collections of workstations 
prove practical and efficient. 

B. RECOMMENDATIONS TO 
IMPLEMENT THESE GOALS 

REMOVING BARRIERS TO HPC 
TECHNICAL PROGRESS AND HPC USAGE 

Recommendation B-1: To accelerate progress in
developing the HPC technology needed by users, 
NSF should create, in the Directorate for Com- 
puter and Information Science and Engineering, a 
challenge program in computer science with grant 
size and equipment access sufficient to support the 
systems and algorithm research needed for more 
rapid progress in HPC capability. The Centers, in 
collaboration with hardware and software vendors, 
can provide test platforms for much of this work, 
and recommendation A-3 provides the hardware
support required for initial development of
prototypes. 

Recommendation B-2: A significant barrier to 
rapid progress in HPC application lies in the for- 
mulation of the computational strategy for solving 
a scientific or engineering problem. In response to 
Challenge 1, the NSF should focus attention, both 
through CISE and through its disciplinary pro- 
gram offices, on support for the design and 
development of computational techniques, algo- 
rithmic methodology, and mathematical, physical 
and engineering models to make efficient use of 
the machines. 

BALANCING THE PYRAMID OF 
HPC ACCESS 

Recommendation B-3: We recommend NSF set
up a task force to develop a way to ameliorate the 
imbalance in the HPC "pyramid" — the underin- 
vestment in the emerging mid-range scalable, 
parallel computers and the inequality of access to 
stand-alone (but potentially networked) worksta- 
tions in the disciplines. This implementation plan 
should involve a combination of funding by dis-
ciplinary program offices and some form of more
centralized allocation of NSF resources. 

C. THE NSF HPC CENTERS 

Recommendation C-1: The Centers should be
retained and their missions should be reaffirmed. 
However, the NSF HPC effort now embraces a 
variety of institutions and programs — HPC 
Centers, Engineering Research Centers, and 
Science & Technology Centers devoted to HPC re- 
search, and disciplinary investments in computer 
and computational science and applied mathe- 
matics — all of which are essential elements of 
the HPC effort needed for the next decade. Fur- 
thermore, HPC institutions outside the NSF orbit 
also contribute to the goals for which the NSF 
Centers are chartered. Thus we ask the Board to 
recognize that the overall structure of the HPC pro-
gram at NSF will have more institutional diversity,
more flexibility, and more interdependence with
other agencies and private institutions than was
possible in the early years of the HPC initiative. 

The NSF should continue its current practice of 
encouraging HPC Center collaboration, both 
with one another and with other entities 
engaged in HPC work. The division of the sup-
port budget into one component committed to
the centers and another for multi-center ac-
tivities is a useful management tool, even
though it may have the effect of reducing com-
petition among centers. The National Consor-
tium for HPC (NCHPC), formed by NSF and
ARPA, is a welcome measure as well.

Recommendation C-2: The current situation in
HPC is more exciting, more turbulent, and
more filled with promise of really big benefits to 
the nation than at any time since the Lax report; 
this is not the time to "sunset" a successful, chang- 
ing venture, of which the Centers remain an impor- 
tant part. Furthermore, we also recommend 
against re-competition of the four Centers at this 
time, favoring periodic performance evaluation 
and competition for some elements of their ac- 
tivities, both among Centers and when appropriate 
with other HPC Centers such as those operated by 
states (see Recommendation D-1).

Recommendation C-3: The mission of the
Centers is to foster rapid progress in the use of 
HPC by scientists and engineers, to accelerate 
progress in usability and economy of HPC and to 
diffuse HPC capability throughout the technical 
community, including industry. Provision to scien-
tists and engineers of access to leading edge super-
computer resources will continue to be a primary
purpose of the Centers. The following additional 
components of the Center missions should be af- 
firmed: 

♦ Supporting computational science, by re- 
search and demonstration in the solution of 
significant science and engineering problems. 

♦ Fostering interdisciplinary collaboration — 
across sciences and between sciences and 
computational science and computer science 
— as in the Grand Challenge programs. 



♦ Prototyping and evaluating software, new ar- 
chitectures, and the uses of high speed data 
communications in collaboration with: com- 
puter and computational scientists, discipli- 
nary scientists exploiting HPC resources, the 
HPC industry, and business firms exploring 
expanded use of HPC. 

♦ Training and education, from post-docs and 
faculty specialists to introduction of less ex- 
perienced researchers to HPC methods, to col- 
laboration with state and regional HPC 
centers working with high schools and com- 
munity colleges. 

ALLOCATION OF CENTER HPC 
RESOURCES TO INVESTIGATORS 

Recommendation C-4: The NSF should continue 
to monitor the administrative procedures used to 
allocate Center resources, and the relationship of 
this process to the initial funding of the research 
by the disciplinary program offices, to ensure that 
the burden on scientists applying for research sup- 
port is minimized. NSF should continue to pro- 
vide HPC resources to the research community 
through allocation committees that evaluate com- 
petitively proposals for use of Center resources. 

EDUCATION AND TRAINING 

Recommendation C-5: The NSF should give 
strong emphasis to its education mission in HPC, 
and should actively seek collaboration with state- 
sponsored and other HPC centers not supported 
primarily on NSF funding. Supercomputing 
regional affiliates should be candidates for NSF 
support, with education as a key role. HPC will 
also figure in the Administration's industrial exten- 
sion program, in which the states have the primary 
operational role. 
D. NSF AND THE NATIONAL HPC
EFFORT; RELATIONSHIPS WITH 
THE STATES 

Recommendation D-1: We recommend that NSF
urge OSTP to establish an advisory committee rep-
resenting the states, HPC users, NSF Centers, com-
puter manufacturers, and computer and computational
scientists (similar to the Federal Networking 
Council's Advisory Committee), which should 
report to HPCCIT. A particularly important role 
for this body would be to facilitate state-federal 
planning related to high performance computing. 
INTRODUCTION AND BACKGROUND



A revolution is underway in the practice of science 
and engineering, arising from advances in com- 
putational science and new models for scientific 
phenomena, and made possible by advances in 
computer science and technology. The importance 
of this revolution is not yet fully appreciated be- 
cause of the limited fraction of the technical com- 
munity that has developed the skills required and 
has access to high performance computational 
resources. These skill and access barriers can be 
dramatically lowered, and if they are, a new level 
of creativity and progress in science and engineer- 
ing may be realized which will be quite different 
from that known in the past. This report is about
that opportunity for all of science and engineering; 
it is not about the needs of one or two specialized 
disciplines. 

A little over a decade ago, the National Science 
Board convened a panel chaired by Prof. Peter 
Lax to explore what NSF should do to exploit the
potential for science and industry of the rapid ad- 
vances in high performance computing. 1 The ac- 
tions taken by the NSF with the encouragement of 
the Board to implement the "Large Scale Comput- 
ing in Science and Engineering" Report of 1982 
have helped computing foster a revolution in 
science and engineering research and practice, in 
academic institutions and to a lesser extent in in- 
dustrial applications. At the time, centralized 
facilities were the only way to provide access to 
high performance computing, which compelled 
the Lax panel to recommend the establishment of 
NSF Supercomputer Centers interconnected by a 
high speed network. The new revolution is charac- 
terized both by advances in the power of super- 
computers and by the diffusion throughout the 
nation of access to and experience with using high
performance computing. This success has opened
up a vast set of new research and applications
problems amenable to solution through high levels
of computational power and better computational
tools.

1 Report of the Panel on Large Scale Computing in Science
and Engineering, Peter Lax, chairman, commissioned by the
National Science Board in cooperation with the U.S. Depart-
ment of Defense, Department of Energy, and the National
Aeronautics and Space Administration, December 26, 1982.

The key features of the new capabilities include: 

♦ The power of the big, multiprocessing vector 
supercomputers, today's workhorse of supercom- 
puting, has increased by a factor of 100 to 200 
since the Lax Report. 3 

♦ An exciting array of massively parallel proces-
sors (MPP) has appeared in the market, offering
three possibilities: an acceleration in the rate of ad- 
vance of peak processing power, an improvement 
in the ratio of performance to cost, and the option
to grow the power of an installation incrementally 
as the need arises. 4 

♦ Switched networks based on high speed digi- 
tal communications are extending access to major 
computational facilities, permitting the dynamic 
redeployment of computing power to suit the 
users' needs, and improving connectivity among
collaborating users. 



2 With every new generation of computing machines, the 
capability associated with "high performance computing"
changes. High performance computing (HPC) may be 
defined as "a computation and communications capability 
that allows individuals and groups to extend their ability to 
solve research, design, and modelling problems substantially 
beyond that available to them before." This definition recog-
nizes that HPC is a relative and changing concept. For the
PC user a scientific workstation is high performance comput- 
ing. For the technical people with specialized skill in com- 
putational science and access to high performance facilities, 
a reasonable level for 1992-1993 might be 1 Gflop for a vec- 
tor machine and 2 Gflops for a MPP system. 

3 As noted in Appendix C, the clock speed of a single vector 
processor has only increased by a factor of 5 to 6 since 1976, 
but a 16-way Cray C-90 with one additional vector pipe mul-
tiplies the effective speed by the estimated factor of a hundred
or more. 

4 The promise (not yet realized) of massively parallel systems 
is a much higher degree of installed capacity expandability with 
minimal disruption to the user's programming. 
♦ Technical progress in computer science and
microelectronics has transformed yesterday's su-
percomputers into today's emerging desktop
workstations. These workstations offer more 
flexible tradeoffs between ease of access and in- 
herent computing power and can be coupled to the 
largest supercomputers over a national network, 
used in locally-networked clusters, or as stand- 
alone processors. 

♦ Advances in computational mathematics,
algorithmic modeling, and software, along with
new computer architectures,
are solving some of the most intractable but impor- 
tant scientific, technical, and economic problems 
facing our society. 

To address these changes, the National Science 
Board charged this panel with taking a fresh look 
at the current situation and new directions that 
might be required. (See Appendix A for institution- 
al identification of the panel membership and Ap- 
pendix B for historical background leading to the 
present study and the Charge to the Panel.) 

To provide both direction and potential to exploit 
these advances, a leadership role for the NSF con-
tinues to be required. The goal of this report is to
suggest how NSF should evolve its role in high 
performance computing. Our belief that NSF can 
and should continue to exert influence in these 
fields is based in part on its past successes 
achieved through the NSF Program in High Perfor- 
mance Computing and Communications. 

Achievements Since the Lax Report 

In the past 10 years, the NSF Program in High Per- 
formance Computing and Communications has: 

♦ Facilitated many new scientific discoveries 
and new industrial processes, and supported fun- 
damental work which has led to advances in ar- 
chitectures, tools and algorithms for 
computational science. 

In Appendix E of this report several panel 
members describe examples of those ac- 
complishments and suggest their personal 
visions for what may be even more dramatic
progress in the future. 

♦ Supported fundamental work in computer 
science and engineering which has led to advances 
in architectures, tools, and algorithms for computa- 
tional science. 

♦ Initiated collaborations with many companies 
to help them realize the economic and technologi- 
cal benefits of high performance computing. 

Caterpillar Inc. uses supercomputing to 
model diesel engines in an attempt to reduce 
emissions. 

Dow Chemical Company simulates and 
visualizes fluid flow in chemical processes to 
ensure complete mixing. 

USX has turned to supercomputing to im- 
prove the hot rolling process-control systems 
used in steel manufacturing. 

Solar Turbine, Inc. applies computational 
finite-element methods to the design of very 
complex mechanical systems. 

♦ Opened up supercomputer access to a wide 
range of researchers and industrial scientists and 
engineers. 

This was one of the key recommendations of 
the Lax Report. The establishment of the 
four NSF Supercomputer Centers (in addition 
to NCAR) has been extraordinarily success- 
ful. By providing network access, through 
the NSFNET and Internet linkages, NSF has 
put these computing resources at the finger- 
tips of scientists, engineers, mathematicians 
and other professionals all over the nation. 
Users seldom need to go personally to these 
Centers; in fact, the distribution of computa- 
tional cycles by the four NSF Supercomputer 
Centers shows surprisingly little geographic 
bias. This extension of compute power, away 
from dedicated, on-site facilities and towards 
a seamless national computing environment 
has been instrumental in creating the condi- 
tions required for advances on a broad front 
in science, engineering, and the tools of com- 
putational science. There seems to be a lack 
of geographic bias in users - Figure 1 in Ap-
pendix D shows users widely distributed
across the United States. 

♦ Educated literally thousands of scientists, en- 
gineers and students, as well as a new generation 
of researchers who now use computational science 
equally with theory and experiment. 

At the time of the Lax Report access to the 
most advanced facilities was restricted to a 
relatively small set of users. Furthermore su- 
percomputing was regarded by many scien- 
tists as either an inaccessible tool or as an 
inelegantly brute force approach to science. 
The NSF program successfully inoculated vir-
tually all of the disciplines with the realiza-
tion that HPC is both a powerful and a 
practical tool for many purposes. These NSF 
initiatives have not only pushed the technol- 
ogy and computational science ahead in 
sophistication and power, they have helped 
bring high performance computing to a large 
fraction of the technical community. There 
has been a 5-fold increase in number of NSF 
funded scientists using HPC and a 5-fold in- 
crease in ratio of graduate students to faculty 
using HPC through the NSF Supercomputer 
Centers. (See Figure 2 of Appendix D) 

♦ Provided the HPC industry a committed, en- 
thusiastic, and dedicated class of expert users who 
share their experience and ideas with vendors, ac- 
celerating the evolutionary improvement in the 
technology and its software. 

One of the problems in the migration of new 
technologies from experimental environ-
ments to production modes is the inherent
risk in committing substantial resources
towards converting existing codes and 
developing software tools. The NSF Super- 
computer Centers have provided a proving 
ground for these new technologies; various 
industrial players have entered into partner- 
ships with the Centers aimed at accelerating 
this migration while maintaining solid and 
reliable underpinnings. 



♦ Encouraged the Supercomputer Centers to 
leverage their relationship with HPC producers to 
reduce the cost of bringing innovation to the scien- 
tific and engineering communities. 

In recognition of Center activity in improving 
early versions of hardware and software for
high performance computing systems, the 
computer industry has provided equipment at 
favorable prices and important technical sup- 
port. This has allowed researchers earlier and 
more useful access to HPC facilities than 
might have been the case under commercial
terms. 

♦ Joined into successful partnerships with other 
agencies to make coordinated contributions to the 
U.S. capability in HPC. 

A decade ago the United States enjoyed a 
world-wide commercial lead in vector sys- 
tems. In part as the result of more recent 
development and procurement actions of the 
Advanced Research Projects Agency, the Department
of Energy, and the National Science Founda- 
tion, the U.S. now has the dominant lead in 
providing new Massively Parallel Processing 
(MPP) systems. 5 As an example, the NSF 
has enabled NSF Supercomputer Center ac- 
quisitions of scalable parallel systems first 
developed under seed money provided by 
ARPA, and thus has been instrumental in 
leveraging ARPA projects into the
mainstream. 6 (Figure 3 of Appendix D
shows data on the uptake of advanced com-
puting by sector across the world.)

5 Massively parallel computers are constructed from large
numbers of separate processors linked by high speed com-
munications providing access to each other and to shared I/O
devices and/or computer memory. There are many different
architectural forms of MPP machines, but they have in com-
mon economies of scale from the use of microprocessors
produced at high volumes and the ability to combine them at
many levels of aggregation. The challenge in using such
machines is to formulate the problem so that it can be decom-
posed and run efficiently on most or all of the processors
concurrently. Some scientific problems lend themselves to
parallel computation much more easily than others, suggest-
ing that improved utility of MPP machines will not be
availed in all fields of science at once.



The Lax Report 

All of these accomplishments have, in large part,
arisen from the response by NSF to the recommen- 
dations of the 1982 Lax report "Large Scale Com- 
puting in Science and Engineering". These 
recommendations included: 

♦ Increase access to regularly upgraded super- 
computing facilities via high bandwidth net- 
works. 

♦ Increase research in computational mathe- 
matics, software, and algorithms. 

♦ Train people in scientific computing. 

♦ Invest in research on new supercomputer sys- 
tems. 

For several reasons, NSF's investment in computa- 
tional research and training has been a startling 
success. First, there has been a widespread accep- 
tance of computational science as a vital com- 
ponent and tool in scientific and technological 
understanding. Second, there have been revolution- 
ary advances in computing technology in the past 
decade. And third, the demonstrated ability to 
solve key critical problems has advanced the 
progress of mathematics, science and engineering 
in many important ways, and has created great 
demand for additional HPC resources. 

The New Opportunities in Science and in 
Industry 

As discussed in detail in the Appendix E essays, 
the prospects are for dramatic progress in science 
and engineering and for rapid adoption of com- 



6 Scalable parallel machines are those in which the number 
of processor nodes can be expanded over a wide range 
without substantial changes in either the shared hardware or 
the application interfaces of the operating system. 



putational science in industry. The next major 
HPC revolution may well be in industry, which is 
still seriously under-utilizing HPC (with some ex- 
ceptions such as aerospace, automotive, and 
microelectronics). 

The success of the chemical industry in design- 
ing and simulating pilot plants, of the aircraft in- 
dustry in simulating wind tunnels and 
performing dynamic design evaluation, and of 
the electronics industry in designing integrated 
circuits and modelling the performance of com- 
puters and networks suggests the scale of avail- 
able opportunities. The most important 
requirements are (a) improving the usability and 
efficiency problems of high performance 
machines, and (b) training in HPC for people 
going into industry. The Supercomputer Centers 
have demonstrated they can introduce the com- 
mercial sector to HPC at little cost, and with 
high potential benefits to the economy (produc- 
tivity of industry and stimulation of markets for 
U.S. HPC vendors). Success in stimulating 
HPC usage in industry will also accelerate the need 
for HPC education and technology, thus exploit- 
ing the benefits of collaboration with univer- 
sities and vendors. The Centers' role can be a 
catalytic one, but often rises to the level of a 
true collaborative partnership with industry, to 
the mutual advantage of the firm and the NSF 
Centers. As industrial uses of HPC grow, the 
scientists, mathematicians, and engineers 
benefit from the falling costs and rising 
usability of the new equipment. In addition the 
technological uses of HPC spur new and inter- 
esting problems in science. The following chart 
indicates the increasing importance of advanced 
computing in industry. 

Cray Research Inc. supercomputer sales 

Era            Percent to    Percent to    Percent to 
               government    industry      universities 

Early 1980s    70            25            5 
Late 1980s     60            25            15 
Today          40            40            20 
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The New Technology 

Most HPC production work being done today uses 
big vector machines in single processor (or loosely 
coupled multiprocessor) mode. Vectorizing 
Fortran compilers and other software tools are 
well tested and many people have been trained in 
their use. These big shared memory machines will 
continue to be the mainstay of high performance 
computing, at least for the next 5 years or so, and 
perhaps beyond if the promise of massively paral- 
lel supercomputing is delayed longer than many 
expect. 

New desk top computers have made extraordinary 
gains in cost-performance (driven by competition- 
driven commodity microprocessor production). 
Justin Rattner of Intel estimated that in 1996 
microprocessors with clock speeds of 200 MHz 
may power an 800 Mflops peak speed worksta- 
tion. 7 He, and others from the industry, predicted 
the convergence of the clock speeds of 
microprocessor chips and the large vector 
machines such as the Cray C90, perhaps as soon 
as 1995. They held out the likelihood that in 1997 
microprocessors may be available at 1 gigaflop; a 
desktop PC might be available with this speed for 
$10,000 or less. Mid-range workstations will also 
show great growth in capacity. Today one can pur- 
chase a mid-range workstation with a clock speed 
of 200MHz for an entry price of $40,000 to 
$50,000. 
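
The arithmetic behind the Rattner estimate can be made explicit. The sketch below only divides the quoted peak rate by the quoted clock rate. Reading the quotient as floating point operations completed per clock cycle is an assumption made here for illustration, and the function name is hypothetical.

```python
def flops_per_cycle(peak_mflops: float, clock_mhz: float) -> float:
    """Peak floating point operations per clock cycle implied by two quoted rates."""
    if clock_mhz <= 0:
        raise ValueError("clock_mhz must be positive")
    # Mflops / MHz = (10^6 operations per second) / (10^6 cycles per second)
    return peak_mflops / clock_mhz


# Figures quoted in the text: a 200 MHz microprocessor, an 800 Mflops peak workstation.
implied = flops_per_cycle(peak_mflops=800.0, clock_mhz=200.0)
print(f"Implied peak floating point operations per cycle: {implied:g}")
```

The quotient is 4, so the projected workstation speed depends on completing several floating point operations per cycle and not on clock rate alone. The text does not say how that would be achieved.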

Thus a technical transition is underway from the 
world in which uniprocessor supercomputers were 



distinguished from desktop machines by having 
much faster cycle times, to a world in which cycle 
times converge and the highest levels of computer 
power will be delivered through parallelism, 
memory size and bandwidth, and I/O speed. The 
widespread availability of scientific workstations 
will accelerate the introduction of more scientists 
and engineers to high performance computing, 
resulting in a further acceleration of the need for 
higher performance machines. Early exploration 
of message-passing distributed operating systems 
gives promise of loosely-coupled arrays of 
workstations being used to process large problems 
in the background and when the workstations are 
unused at night, as well as coupling the worksta- 
tions (on which problems are initially designed 
and tested) to the supercomputers located at 
remote facilities. 

Of course, the faster microprocessors also make 
possible new MPP machines of ever increasing 
peak processing speed. MPP is catching on fast, 
as researchers with sufficient expertise (and 
diligence) in computational science are solving a 
growing number of applications that lend themsel- 
ves to highly parallel architectures. In some cases 
those investigators are realizing a ratio of sustained 
to peak performance approaching that 
achieved by vector machines, with significant cost- 
performance advantages. Efficient use of MPP on 
the broad range of scientific and engineering 
problems is still beyond the reach of most inves- 
tigators, however, because of the expertise and ef- 
fort required. Thus the first speculative phase of 
MPP HPC is coming to an end, but its ultimate 
potential is still uncertain and largely unrealized. 
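
One common way to see why only some applications "lend themselves to highly parallel architectures" is Amdahl's law. The report does not invoke it by name; it is brought in here purely as an illustration. The sketch assumes a program in which a fixed fraction of the work can be spread evenly over the processors and the rest runs serially. The fractions and processor counts are hypothetical, not measurements.

```python
def amdahl_speedup(parallel_fraction: float, processors: int) -> float:
    """Speedup over one processor when only part of the work parallelizes."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must lie in [0, 1]")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


def parallel_efficiency(parallel_fraction: float, processors: int) -> float:
    """Fraction of the machine's aggregate peak that the program can use."""
    return amdahl_speedup(parallel_fraction, processors) / processors


# Hypothetical applications with 90%, 99% and 99.9% parallelizable work.
for fraction in (0.90, 0.99, 0.999):
    for count in (16, 1024):
        speedup = amdahl_speedup(fraction, count)
        efficiency = parallel_efficiency(fraction, count)
        print(f"parallel={fraction:.3f} processors={count:5d} "
              f"speedup={speedup:8.1f} efficiency={efficiency:6.1%}")
```

Under this model a program that is 90 percent parallel can never run more than ten times faster, however many processors are added, which is one reason the expertise and effort needed to restructure a problem matter so much.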



7 The instruction execution speeds of scientific computers are 
generally reckoned in the number of floating point instruc- 
tions that can be executed in one second. Thus a 1 Megaflop 
machine executes 1 million floating point instructions per 
second, a Gigaflop would be one billion instructions per 
second, and a Teraflop 10^12 floating point instructions per 
second. Since different computer architectures may have 
quite different instruction sets one "flop" may not be the 
same as another, either in application power or in the number 
of machine cycles required. To avoid such difficulties, those 
who want to compare machines of different architecture 
generally use a benchmark suite of test cases to measure 
overall performance on each machine. 
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Limiting progress in all three of these technologies 
is a set of architecture and software issues that are 
discussed below in Recommendations B. Principal 
among them is the evolution of a programming 
model that can allow portability of applications 
software across architectures. 

These technical issues are discussed at greater 
length in Appendix C. 
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FOUR CHALLENGES FOR NSF 

High performance computing is changing very 
fast, and NSF policy must chase a moving target. 
For that reason, the strategy adopted must be agile 
and flexible in order to capitalize on past invest- 
ments and adapt to the emerging opportunities. 
The Board and the Foundation face four central 
challenges, on which we will make specific recom- 
mendations for policy and action. These challen- 
ges are: 

♦ Removing barriers to the rapid evolution of 
HPC capability 

♦ Providing scalable access to all levels of 
HPC capability 

♦ Finding the right incentives to promote ac- 
cess to all three levels of the computational 
power pyramid 

♦ Creating NSF's intellectual and management 
leadership for the future of high performance 
computing in the U.S. 

CHALLENGE NO. 1: Removing barriers to 
the rapid evolution of HPC capability 

How can NSF, as the nation's premier agency 
funding basic research, remove existing barriers to 
the rapid evolution of High Performance Comput- 
ing? These barriers are of two kinds: technologi- 
cal barriers (primarily to realizing the promise of 
highly parallel machines, workstations, and net- 
works) and exploitation barriers (new mathemati- 
cal methods and new ways to formulate science 
and engineering problems for efficient and effec- 
tive computation). An aggressive commitment by 
NSF to leadership in research and prototype 
development, in both computer science and com- 
putational science, will be required. Indeed, 
NSF's position as the leading provider of HPC 
capability to the nation's scientists and engineers 
will be strengthened if it commands a leadership 
role in technical advances in both areas, which 
will contribute to the nation's economic position 
as well as its position as a world leader in research. 



Computer Science and Engineering. The first 
challenge is to accelerate the development of 
the technology underlying high performance 
computing. Among the largest barriers to effec- 
tive use of the emerging HPC technologies are 
parallel architectures from which it is easy to ex- 
tract peak performance, system software (operat- 
ing systems, databases of massive size, 
compilers, and programming models) to take ad- 
vantage of these architectures and provide por- 
tability of end-user applications, parallel 
algorithms, and advances in visualization techni- 
ques to aid in the interpretation of results. The 
technical barriers to progress are discussed in 
Appendix C. What steps will most effectively 
reduce these barriers? 

Computational Tools for Advancing Science 
and Engineering. Research in the develop- 
ment of computational models, the design of al- 
gorithmic techniques, and their accompanying 
mathematical and numerical analysis, is re- 
quired in order to ensure the continued evolu- 
tion of efficient and accurate computational 
algorithms designed to make optimal use of 
these emerging technologies. 

In the past ten years, exciting developments in 
computer architectures, hardware and software 
have come in tandem with stunning 
breakthroughs in computational techniques, 
mathematical analysis, and scientific models. 
For example, the potential of parallel machines 
has been realized in part through new versions 
of numerical linear algebra routines and multi- 
grid techniques; rethinking and reformulating al- 
gorithms for computational physics within the 
domain of parallel machines has posed sig- 
nificant and challenging research questions. Ad- 
vances in such areas as N-body solvers, fast 
special function techniques, wavelets, high 
resolution fluid solvers, adaptive mesh techni- 
ques, and approximation theory have generated 
highly sophisticated algorithms to handle com- 
plex problems. At the same time, important 
theoretical advances in the modelling of under- 
lying physical and engineering problems have 
led to new, efficient and accurate discretization 
techniques. Indeed, in the evolution to scalable 




computing across a range of levels, designing 
appropriate numerical and computational techni- 
ques is of paramount importance. The challenge 
facing NSF is to weave together existing work 
in these areas, as well as to foster new bridges 
between pure, applied and computational techni- 
ques, engaging the talents of disciplinary scien- 
tists, engineers, and mathematicians. 

CHALLENGE NO. 2: Providing scalable 
access to all levels of HPC capability 

How can NSF provide scalable access to comput- 
ing resources, from the high performance worksta- 
tions needed by most scientists to the critically 
needed teraflop-and-beyond capability required 
for solving Grand Challenge problems? 8 What 
balance should NSF anticipate and encourage 
among high performance desktop workstations, 
mid-range or mini-supercomputers, networks of 
workstations, and remote, shared supercomputers 
of very high performance? 

Flexible strategy. NSF must ensure that ade- 
quate additional computational capacity is avail- 
able to a steadily growing user community to 
solve the next generation of more complex 
science and engineering problems. A flexible 
and responsive strategy that can support the 
large number of evolving options for HPC and 
can adapt to the outcomes of major current 
development efforts (for example in MPP sys- 
tems and in networked workstations) is required. 

A pyramid of computational capability. There 
will continue to be an available spectrum span- 
ning almost five orders of magnitude of com- 
puter capabilities and prices. 9 NSF, as a leader 



8 By scalable access we mean the ability to develop a prob- 
lem on a workstation or intermediate sized machine and 
migrate the problem with relative efficiency to larger 
machines as increased complexity requires it. Scalable ac- 
cess implies scalable architectures and software. 

9 A Paragon machine of 300 Gigaflops peak performance would 
be five orders of magnitude faster than a 3 megaflop entry 
workstation. Effective performance in most science applica- 
tions would, however, be perhaps a factor of ten lower. 
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in the national effort in high performance com- 
puting, should support a "pyramid" of comput- 
ing capability. At the apex of the pyramid is the 
highest performance systems that affordable 
technology permits, established at national 
facilities. At the next level, every major re- 
search university should have access to one, or 
a few, intermediate-scale high-performance sys- 
tems and/or aggregated workstation clusters. 10 
At the lowest level are workstations with 
visualization capabilities in sufficient numbers 
to support computational scientists and en- 
gineers. 
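
The "almost five orders of magnitude" spanned by the pyramid can be checked from the two machines named in the footnote on the Paragon. The sketch below uses only the figures quoted there. Applying the "factor of ten lower" effective performance to the Paragon alone, and not to the workstation, is an assumption; the constant and function names are hypothetical.

```python
import math

PARAGON_PEAK_FLOPS = 300e9   # 300 Gigaflops peak, as quoted in the footnote
WORKSTATION_FLOPS = 3e6      # 3 megaflop entry workstation, as quoted
EFFECTIVE_DERATING = 10.0    # "perhaps a factor of ten lower" in most applications


def orders_of_magnitude(fast_flops: float, slow_flops: float) -> float:
    """Base-10 logarithm of the ratio between two floating point rates."""
    return math.log10(fast_flops / slow_flops)


peak_span = orders_of_magnitude(PARAGON_PEAK_FLOPS, WORKSTATION_FLOPS)
effective_span = orders_of_magnitude(
    PARAGON_PEAK_FLOPS / EFFECTIVE_DERATING, WORKSTATION_FLOPS
)
print(f"Span at peak performance:      {peak_span:.1f} orders of magnitude")
print(f"Span at effective performance: {effective_span:.1f} orders of magnitude")
```

On these figures the peak ratio is 100,000 to 1, and the derated ratio is closer to four orders of magnitude, consistent with the word "almost" in the text.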

Mid-range computational requirements. 

Over the next five years, the middle range of 
scientific computing and computational en- 
gineering will be handled by an amazing variety 
of moderately parallel systems. In some cases, 
these will be scaled-down versions of the 
highest performance systems available; in other 
cases, they will be systems targeted at the 
midrange computing market. The architecture 
will vary from shared memory at one end of the 
spectrum to workstation networks at the other, 
depending on the types of parallelism in the 
local spectrum of applications. Loosely 
coupled networks of workstations will compete 
with mid-range systems for performance of 
production HPC work. At the same time 
autonomous mid-range systems are needed to 
support the development of next-generation ar- 
chitectures and software by computer science 
groups. 

The panel perceives that there are imbalances in 
access to the pyramid of HPC resources (see the 
following table). The disciplinary NSF program 
offices have not been uniformly effective in 
responding to the need for a desktop environ- 



10 As discussed in the recommendations, dedicated mid- 
range systems are required not only for science and engineer- 
ing applications but also for research to improve HPC 
hardware and software, and for interactive usage. For 
science and engineering batch applications, net- 
works of workstations will likely develop into an alternative. 
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ment for their supported researchers, and there 
is serious under-investment in the mid-sized 
machines. The distribution of investment tends 
to be bimodal, to the disadvantage of mid-range 
systems. The incentive structures internal to the 
Foundation do not address this distortion. 
NSF's HPCC coordinating mechanism needs to 
address this distortion in a more direct manner. 



Computational infrastructure at NSF 
(FY92 $, M) 

                  Other NSF    ASC 

Workstations      20.1         3.2 
Small Parallel    2.1          0.5 
Large Parallel    9.4          3.2 
Mainframe         9.1          16.3 
Total             40.8         23.2 
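
The imbalance described above can be read directly from the table. The sketch below copies the table's figures and computes each category's share of the combined total. Treating the "Small Parallel" row as the mid-range systems is an assumption, and the names in the code are hypothetical.

```python
# FY92 figures in millions of dollars, copied from the table: (Other NSF, ASC).
INFRASTRUCTURE = {
    "Workstations": (20.1, 3.2),
    "Small Parallel": (2.1, 0.5),
    "Large Parallel": (9.4, 3.2),
    "Mainframe": (9.1, 16.3),
}
PRINTED_TOTALS = (40.8, 23.2)  # column totals exactly as printed


def category_share(category: str) -> float:
    """Percentage of the printed grand total spent on one category."""
    other_nsf, asc = INFRASTRUCTURE[category]
    return 100.0 * (other_nsf + asc) / sum(PRINTED_TOTALS)


def column_sums() -> tuple:
    """Sum each column from the rows, for comparison with the printed totals."""
    other_nsf = sum(row[0] for row in INFRASTRUCTURE.values())
    asc = sum(row[1] for row in INFRASTRUCTURE.values())
    return other_nsf, asc


for name in INFRASTRUCTURE:
    print(f"{name:<15} {category_share(name):5.1f}% of total")
print("Column sums from rows: %.1f and %.1f" % column_sums())
```

The "Small Parallel" row comes to about 4 percent of the combined total. The "Other NSF" rows sum to 40.7 against a printed total of 40.8, a difference that is presumably rounding in the source figures.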



CHALLENGE NO. 3: The right incentives to 
promote access to all three levels of the com- 
putational institution pyramid 

The third challenge is to encourage the continued 
broadening of the base of participation in HPC, 
both in terms of institutions and in terms of skill 
levels and disciplines. 

Lax Report incentives. At the time of the Lax 
report, relatively few people were interested in 
HPC; even fewer had access to supercomputers. 
Some users were fortunate to have contacts 
with someone at one of a few select government 
laboratories where computer resources were 
available. Most, however, were less fortunate 
and were forced to carry out their research on 
small departmental machines. This severely 
limited the research that could be carried out to 
problems that would "fit" into available resour- 
ces. NSF addressed this problem by concentrat- 
ing supercomputer resources in Centers; by this 
means those in the academic community most 
prepared and motivated were provided with ac- 
cess to machine cycles. 

Need for expanded scope of access. Now that 
these resources are available on a peer review 

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basis to everyone no matter where they work, it 
is clear the research community cannot accept a 
return to the previous mode of operation. The 
high performance computing community has 
grown to depend on NSF to make the necessary 
resources available to continually upgrade the 
Supercomputer Centers in support of their com- 
putational science and engineering applications. 
NSF needs to broaden the base of participation 
in HPC through NSF program offices as well as 
through the Supercomputer Centers. There is no 
question that HPC has broken out of its original 
narrow group of privileged HPC specialists. 
The SuperQuest competition for high school stu- 
dents already demonstrates how quickly young 
people can master the effective use of HPC 
facilities. Other agencies, states, and private 
HPC centers are springing up, making major 
contributions not only to science but to K-12 
education and to regional economies. NSF's 
policies on expanding access and training must 
take advantage of the leverage these Supercom- 
puter Centers can provide. 

Allocation of HPC resources. There remains 
the question of the best way to allocate HPC 
resources. Should Supercomputer Centers con- 
tinue to be funded to allocate HPC cycles com- 
petitively, or should NSF depend on the 
"market" of funded investigators for allocation 
of HPC resources? This question gets at two 
other issues: (a) the future role of the Centers 
and (b) the best means for ensuring adequate 
funding of workstations and other means of 
HPC access throughout the NSF. The Centers 
have peer review committees which allocate 
HPC resources on the basis of competitive 
project selection. The Panel believes these al- 
locations are fairly made and reflect solid 
professional evaluation of computational merit. 
The only remaining issue is whether there con- 
tinues to be a need for protected funding for 
HPC access in NSF, including access to shared 
Supercomputer Center facilities. We believe 
strongly that there is such a need. The panel 
does have suggestions for broadening the sup- 
port for the remainder of the HPC pyramid; 
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these are articulated in the recommendations 
below. 

Education and training. A major requirement 
for education and training continues to exist. 
Even though most disciplines have been inocu- 
lated with successful uses of HPC (see Appen- 
dix D essays), and even though graduate student 
and postdoctoral use of HPC resources is rising 
faster than faculty usage, only a minority of 
scientists have the training to allow them to 
overcome the initial barrier to proficiency, espe- 
cially in the use of MPP machines which re- 
quire a high level of computational 
sophistication for most problems. 

CHALLENGE NO. 4: How can NSF best 
create the intellectual and management 
leadership for the future of high performance 
computing in the U.S.? 

What relationships should NSF's activities in HPC 
have to the activities of other federal agencies? 
NSF is a major player. What role should NSF 
play within the scope of the nationally coordinated 
HPCC program and budget, as indicated in the fol- 
lowing chart? 

HPCC Agency Budgets 

Agency       FY92 Funding ($, M) 

ARPA         232.2 
NSF          200.9 
DOE          92.3 
NASA         71.2 
HHS/NIH      41.3 
DOC/NOAA     9.8 
EPA          5.0 
DOC/NIST     2.1 
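
To put "NSF is a major player" in numbers, the sketch below sums the budgets listed in the chart and computes each agency's share. The total is only the sum of the agencies shown; whether the chart covers every HPCC participant is not stated, so the shares are relative to the listed agencies alone. The names in the code are hypothetical.

```python
# FY92 HPCC funding in millions of dollars, copied from the chart.
HPCC_BUDGETS = {
    "ARPA": 232.2,
    "NSF": 200.9,
    "DOE": 92.3,
    "NASA": 71.2,
    "HHS/NIH": 41.3,
    "DOC/NOAA": 9.8,
    "EPA": 5.0,
    "DOC/NIST": 2.1,
}


def listed_total() -> float:
    """Sum of the agency budgets listed in the chart."""
    return sum(HPCC_BUDGETS.values())


def agency_share(agency: str) -> float:
    """Percentage of the listed total that goes to one agency."""
    return 100.0 * HPCC_BUDGETS[agency] / listed_total()


print(f"Listed total: ${listed_total():.1f} million")
for agency in HPCC_BUDGETS:
    print(f"{agency:<9} {agency_share(agency):5.1f}%")
```

On these figures NSF accounts for roughly 31 percent of the listed funding, second only to ARPA.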



NSF leadership in HPCC. The voice of HPCC 
users needs to be more effectively felt in the na- 
tional program; NSF has the best contact with this 
community. NSF has played, and continues to 
play, a leadership role in the NREN program and 
the evolution of the Internet. Its initiative in creat- 
ing the "meta-center" concept establishes an NSF 
role in the sharing and coordination of resources 
(not only in NSF but in other cooperating agencies 
as well), and the concept can be usefully extended 
to cooperating facilities at state level and in 
private firms. The question is, does the current 
structure in CISE, the HPCC coordination office, 
the Supercomputer Centers, and the science and 
engineering directorates constitute the most 
favorable arrangement for that leadership? The 
panel does not attempt to suggest the best ways to 
manage the relationships among these important 
functions, but asks the NSF leadership to assure 
the level of attention and coordination required to 
implement the broad goals of this report. 

Networking. The third barrier is the need for net- 
work access with adequate bandwidth. For wide 
area networks, this is addressed in the NSF HPCC 
NREN strategy. In the future, NSF will focus its 
network subsidies on HPC applications and their 
supporting infrastructure, while support for basic 
Internet connectivity shifts to the research and 
education institutions. 11 



11 NREN is the National Research and Education Network, 
envisioned in the High Performance Computing Act of 1991. 
NREN is not a network so much as it is a program of ac- 
tivities including the evolution of the Internet to serve the 
needs of HPC as well as other information activities. 
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RECOMMENDATIONS 

We have four sets of interdependent recommenda- 
tions for the National Science Board and the Foun- 
dation. The first implements a balanced pyramid 
of computing environments; each element sup- 
ports the others, and as priorities are applied the 
balance in the pyramid should be sustained. The 
second set addresses the essential research invest- 
ments and other steps to remove the obstacles to 
realizing the technologies of the pyramid and the 
barriers to the effective use of these environments. 
The third set addresses the institutional structure 
for the delivery of HPC capabilities, and consists 
itself of a pyramid. At the base of the institutional 
pyramid is the diverse array of investigators in 
their universities and other settings who use all the 
facilities at all levels of the pyramid. At the next 
level are departments and research groups devoted 
to specific areas of computer science or computa- 
tional science and engineering. Continuing up- 
ward are the NSF HPC Centers, which must 
continue to play a very important role, both as 
providers of the major resources of high capability 
computing systems and as aggregations of special- 
ized capability for all aspects of use and advance 
of high performance computing. At the apex is 
the national teraflop facility, which we recom- 
mend as a multi-agency facility pushing the fron- 
tiers of high performance into the next decade. A 
final recommendation addresses the NSF's role at 
the national level and its relationship with the 
states in HPC. 

This report recommends significant expansion in 
NSF investments, both in accelerating progress in 
high performance computing through computer 
and computational science research and in provid- 
ing the balanced pyramid of computing facilities 
to the science and engineering communities, but in 
total they do not exceed the Administration's 
stated intent to double the investments in HPCC 
during the next 5 years. We believe these invest- 
ments are not only justified, but are compatible 
with stated national plans, both in absolute amount 
and in their distribution. 



A. CENTRAL GOAL FOR 
NSF HPC POLICY 

Recommendation A-1: We strongly recommend 
that NSF build on its success in helping the U.S. 
achieve its preeminent world position in high per- 
formance computing by taking the lead, under 
OSTP guidance and in collaboration with ARPA, 
DoE and other agencies, to expand access to all 
levels of the rapidly evolving pyramid of high per- 
formance computing for all sectors of the nation. 
The realization of this pyramid depends, of 
course, on rapid progress in the pyramid's tech- 
nologies. 

High performance computing is essential to the 
leading edge of U.S. research and development. 
It will provide the intelligence and power that 
justifies the breadth of connectivity and access 
promised by the NREN and the National Infor- 
mation Infrastructure. The computational 
capability we envision includes not only the re- 
search capability for which NSF has special 
stewardship, but also includes a rapid expansion 
of capability in business and industry to use 
HPC profitably and the many operational uses 
of HPC in commercial and military activities. 

The panel is concerned that if the government 
fails to implement the planned HPCC invest- 
ments to support the National Information In- 
frastructure, the momentum of the U.S. 
industry, which blossomed in the first phase of 
the national effort, will be lost. Supercomputers 
are only a $2 billion industry, but an industry 
that provides critical tools for innovation across 
all areas of U.S. competitiveness, including 
pharmaceuticals, oil, aerospace, automotive, 
and others. The administration's planned new 
investment of $250 million in HPCC is fully jus- 
tified. Japanese competitors could easily close 
the gap in the HPC sectors in which the U.S. en- 
joys a lead; they are continuing to invest and 
could capture much of the market the U.S. 
government has been helping to create. 
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VISION OF THE HPC PYRAMID 

Recommendation A-2: At the apex of the HPC 
pyramid is a need for a national capability at the 
highest level of computing power the industry can 
support with both efficient software and hardware. 

A reasonable goal for the next 2-3 years would 
be the design, development, and realization of a 
national teraflop-class capability, subject to the 
effective implementation of Recommendation 
B-1 and the development of effective software 
and computational tools for such a large 
machine. 12 Such a capability would provide a 
significant stimulus to commercial development 
of a prototype high-end commercial HPC sys- 
tem of the future. We believe the importance of 
NSF's mission in HPC justifies NSF initiating 
an interagency plan to make this investment, 
and further that NSF should propose to operate 
the facility in support of national goals in 
science and technology. For budgetary and inter- 
agency collaboration reasons OSTP should in- 
voke a FCCSET project to establish such a 
capability on a government-wide basis with 
multi-agency funding and usage. 

If development begins in 1995 or 1996, a 
reasonable guess at the cost of a teraflop 
machine is $50/megaflop for delivery in 1997 
to 1998. If so, $50 million a year might buy one 



12 Some panel members have reservations about the urgency 
of this recommendation, are pessimistic about the likelihood 
of realizing the effective performance in applications, or are 
concerned about the possible opportunity cost to NSF of 
such a large project. The majority notes that the recommen- 
dation is intended to drive solutions to those architectural 
and software problems. Intel's Paragon machine is on the 
market today with 0.3 Teraflops peak speed, but without the 
support to deliver that speed in most applications. The panel 
also recommends a multi-agency federal effort. NSF's share 
of cost and role in managing such a project are left to a 
proposed FCCSET review. 
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such machine per year. Development cost 
would be substantial, perhaps in excess of the 
production cost of one machine; although it is 
not clear to what extent government support 
would be required, this is a further reason to 
suggest a multi-agency program. 14 Support 
costs would also be additional, but one can as- 
sume that one or more of the NSF Supercom- 
puter Centers could host such a facility with 
something like the current staff. 
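
The purchase estimate above follows from a single multiplication, sketched below with the figures quoted in the text. Development and support costs are deliberately left out, since the text gives no figures for them, and the function name is hypothetical.

```python
MEGAFLOPS_PER_TERAFLOP = 1_000_000  # 10^12 / 10^6


def machine_cost_dollars(peak_megaflops: int, dollars_per_megaflop: int) -> int:
    """Purchase cost of a machine priced linearly by peak megaflops."""
    return peak_megaflops * dollars_per_megaflop


# The text's assumption: $50 per megaflop for a one teraflop machine.
teraflop_cost = machine_cost_dollars(MEGAFLOPS_PER_TERAFLOP, 50)
print(f"One teraflop machine: ${teraflop_cost / 1e6:.0f} million")
```

At $50 per megaflop the machine costs $50 million, which is why $50 million a year would buy about one such machine per year. The linear pricing is itself only the panel's working guess.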

Such a nationally shared machine, or machines, 
must be open to competitive merit-evaluated 
proposals for science and engineering computa- 
tion, although it could share this mission of 
responding to the community's research 
priorities with mission-directed work of the 
sponsoring agencies. The investment is jus- 
tified by (a) the existence of problems whose 
solution awaits a teraflop machine, (b) the im- 
portance of driving the HPC industry's innova- 
tion rate, and (c) the need for early and concrete 
experience with the scalability of software en- 
vironments to higher speeds and larger arrays of 
processors, since software development time is 
the limiting factor to hardware acceptance in the 
market. 



13 The cost estimates in this report cannot be much more than 
informed guesses. We have assumed a cost of $50/megaflop 
for purchase of a one teraflop machine in 1997 or 1998. We 
suspect that this cost might be reached earlier, say in 1995 or 
1996 in a mid-range machine, because a tightly-coupled mas- 
sively parallel machine may have costs rising more than 
linearly with the number of processors, overcoming the scale 
economies that might make the cost rise less than linearly. 
The cost estimates in recommendations A2-4 are intended to 
indicate that the scale of investment we recommend is not in- 
compatible with the published plans of the administration 
for investment in HPCC in the next 5 years, and further that 
roughly equal levels of incremental expenditures in the three 
levels of the HPC pyramid could produce the balance among 
these levels that we recommend. 

14 The Departments of Energy and Defense and NASA might 
share a major portion of the development cost and might also 
acquire such machines in the future as well. 




Recommendation A-3: Over a period of 5 years 
the research universities should be assisted to ac- 
quire mid-range machines. 

This will bring a rapid expansion in access to 
very robust capability, reducing pressure on the 
Supercomputer Centers' largest facilities, and al- 
lowing the variety of vendor solutions to be ex- 
ercised extensively. If the new MPP 
architectures prove robust, usable, and scalable, 
these institutions will be able to grow the 
capacity of such systems in proportion to need 
and with whatever incremental resources are 
available. This capability is also needed to pro- 
vide testbeds for computer and computational 
science research and testing. 

These mid-sized machines are the underfunded 
element today — less than 5% of NSF's FY92 
HPC budget is devoted to their acquisition. 
They are needed for both demanding science 
and engineering problems that do not require 
the very maximum in computing capacity, and 
importantly for use by the computer science and 
computational mathematics community in ad- 
dressing the architectural, software, and algo- 
rithmic issues that are the primary barriers to 
progress with MPP architectures. 15 

Engineering is also a key candidate for their 
use. There are 1050 University-Industry Re- 
search Centers in the U.S. Those UIRCs that 
are properly equipped with computational 
facilities can increase the coupling with in- 
dustrial computation, adding greatly to what the 
NSF HPC Supercomputer Centers are doing. 
Many engineering applications, such as robotics 
research, require "real time" interactive com- 
putation which is incompatible with the batch 
environment on the highest performance 
machines. 



15 The development of prototypes of architectures and operat- 
ing systems for parallel computation requires access to a 
machine whose hardware and software can be experimental- 
ly modified. This research often cannot be done on machines 
dedicated to full time production. 

If we assume a cost in three or four years of 
$50/megaflop for mid-sized MPP machines, an 
annual expenditure of $10 million would fund 
the annual acquisition of one hundred 2 
Gigaflop (peak) computers. Support costs for 
users would be additional. 
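
The arithmetic behind this estimate can be checked with a
short sketch. It assumes, as the text does, a flat price of
$50 per peak megaflop and leaves out support costs. The
function names are illustrative only and are not part of the
report. Applying the same price to a one teraflop machine
gives the purchase cost implied by the note on cost
estimates.

```python
# Illustrative arithmetic only; names are hypothetical, not from the report.
# Assumption: purchase cost scales linearly with peak megaflops.

MEGAFLOPS_PER_GIGAFLOP = 1_000
MEGAFLOPS_PER_TERAFLOP = 1_000_000


def machine_cost(peak_megaflops, dollars_per_megaflop):
    """Purchase cost in dollars of one machine at a flat price per peak megaflop."""
    return peak_megaflops * dollars_per_megaflop


def machines_per_year(annual_budget, peak_megaflops, dollars_per_megaflop):
    """Whole machines an annual budget can buy (user support costs excluded)."""
    return annual_budget // machine_cost(peak_megaflops, dollars_per_megaflop)


if __name__ == "__main__":
    price = 50                                  # dollars per peak megaflop
    mid_range = 2 * MEGAFLOPS_PER_GIGAFLOP      # a 2 Gigaflop (peak) machine

    print(machine_cost(mid_range, price))                    # cost of one mid-range machine
    print(machines_per_year(10_000_000, mid_range, price))   # machines bought with $10 million
    print(machine_cost(MEGAFLOPS_PER_TERAFLOP, price))       # cost of a one teraflop machine
```

Under these assumptions each 2 Gigaflop machine costs
$100,000, so $10 million buys one hundred of them, and a one
teraflop machine costs $50 million.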

Recommendation A-4: We recommend that NSF 
double the current annual level of investment ($22 
million) in scientific and engineering workstations 
for its 20,000 principal investigators. 

Many researchers strongly prefer the new high 
performance workstations that are under their 
control and find them adequate to meet many of 
their initial needs. Those without access to the 
new workstations may apply to use remote ac- 
cess to a supercomputer in a Center, but often 
they do not need all the I/O and other 
capabilities of the large shared facilities. NSF 
needs a strategy to off-load work not requiring 
the highest level machines in the Centers. The 
justification is not economy of scale, but 
economy of talent and time. 

When the Lax report was written a 160 Mflop 
peak Cray 1 was a high performance supercom- 
puter. Within 4 or 5 years workstations deliver- 
ing up to 400 megaflops costing no more than 
$15,000 to $20,000 should be widely available. 
For education and a large fraction of the com- 
putational needs of science and engineering, 
these facilities will be adequate. However, once 
visualization of computational output becomes 
routinely required they will be ubiquitously 
needed. With the rapid pace of improvement, 
the useful lifetimes of workstations are decreas- 
ing rapidly; they often cannot cope with the 
latest software. Researchers face escalating 
costs to upgrade their computers. NSF supports 
perhaps some 20,000 principal investigators. 
Equipping an additional 10 percent of this num- 
ber each year (2,000 machines) at $20,000 each 
requires an incremental $20 million. Recom- 
mendation B-3 addresses how this investment 
might be managed. 

Recommendation A-5: We recommend that NSF 
expand its New Technologies program to support 
expanded testing of the practicality of new parallel 
configurations for HPC applications. 16 For ex- 
ample, networks of workstations may meet a sig- 
nificant part of midrange HPC science and 
engineering applications. As progress is made in 
the development of this and other technologies, ex- 
perimental use of the new configurations should 
be encouraged. A significant supplement to HPC 
applications research capacity can be had with 
minimal additional cost if such collections of 
workstations prove practical and efficient. 

There have already been sufficient experiments 
with use of distributed file systems and loosely 
coupled workstations to encourage the belief 
that many compute-intensive problems are 
amenable to this approach. For those problems 
that do not suffer from the latency inherent in 
this approach the incremental costs can be very 
low indeed, for the problems run in background 
and at times the workstations are otherwise un- 
engaged. There are those who strongly believe 
that in combination with object-oriented 
programming this approach can create a revolu- 
tion in software and algorithm sharing as well 
as more economical machine cycles. 17 

EL RECOMMENDATIONS TO 
IMPLEMENT THESE GOALS 

REMOVING BARRIERS TO HPC 
TECHNICAL PROGRESS AND HPC USAGE 

Recommendation B-1: To accelerate progress in 
developing the HPC technology needed by users, 
NSF should create, in CISE, a challenge program 
in computer science with grant size and equipment 
access sufficient to support the systems and algo- 



16 Today NSF CISE has a "new technologies" program that 
co-funds with disciplinary program offices perhaps 50 
projects/yr. This program is in the division that funds the 
Centers, but is focused on projects which can ultimately 
benefit all users of parallel systems. This program funds per- 
haps 15 methods and tools projects annually, in addition to 
those co-funded with science programs. 

17 MITRE Corporation, among others, is pursuing this vision. 

rithm research needed for more rapid progress in 
HPC. The Supercomputer Centers, in collabora- 
tion with hardware and software vendors, can pro- 
vide test platforms for much of this work. 
Recommendation A-3 provides the hardware sup- 
port required for initial development of 
prototypes. 

There is consensus that the absence of sufficient 
funding for systems and algorithms work which 
is not mission-oriented is the primary barrier to 
lower cost, more widely accessible, and more 
usable massively parallel systems. This work, 
including bringing the most promising ideas to 
prototype stage for effective transfer to the HPC 
industry, would address the most significant bar- 
riers to the ultimate penetration of parallel ar- 
chitectures in workstations. Advances on the 
horizon that could be accelerated include more 
advanced network interface architectures and 
operating systems technologies to provide low 
overhead communications in collections of 
workstations, and advances in algorithms and 
software for distributed databases of massive 
size. Computer science has made, and continues 
to make, important contributions to both hard 
and soft parallel machine technology, and has ef- 
fectively transferred these ideas to the industry. 

Two problems impede the full contribution of 
computer science to rapid advance in MPP 
development: grant sizes in the discipline are 
typically too small to allow enough con- 
centrated effort to build and test prototypes, and 
too few computer science departments have ac- 
cess to a mid-sized machine on which systems 
development can be done. 

The Board should ask for a proposal from CISE 
to effectively mobilize the best computer 
science and computational mathematics talent 
to address the solution of these problems in 
the areas of both improved operating systems, 
architectures, compilers, and algorithms for ex- 
isting systems as well as research in next- 
generation systems. We recommend 
establishing a number of major projects, with 
higher levels of annual funding than is typical 
in Computer Science, and assured duration of 
up to five years, for a total annual incremental 
investment of $10 million. We recommend that 
this challenge fund be managed by CISE, and 
be accessible to all disciplinary program offices 
who wish to forward team proposals for add-on 
funding in response to specific proposals from 
the community. 

Recommendation B-2: A significant barrier to 
rapid progress in the application of HPC lies in for- 
mulating a computational strategy to solve a prob- 
lem. In response to Challenge 1 above, NSF 
should focus attention, both through CISE and 
through its disciplinary program offices, on sup- 
port for the design and development of computa- 
tional techniques, algorithmic methodology, and 
mathematical, physical and engineering models to 
make efficient use of the machines. 

Without such work in both theoretical and ap- 
plied areas of numerical analysis, applied mathe- 
matics, and computational algorithms, the full 
benefit of advances in architecture and systems 
software will not be realized. In particular, sig- 
nificantly increased funding of collaborative 
and individual state-of-the-art methodology is 
warranted, and is crucial to the success of high 
performance computing. Some of this can be 
done through the individual directorates with 
funds supplemented by HPCC funds; the Grand 
Challenge Applications Group awards are a 
good first step. 

Recommendation B-3: We recommend NSF set 
up an agency-wide task force to develop a way to 
ameliorate the imbalance in the HPC pyramid - 
the under-investment in the emerging mid-range 
scalable, parallel computers and the inequality of 
access to stand-alone (but potentially networked) 
workstations in the disciplines. This implementa- 
tion plan should involve a combination of funding 
by disciplinary program offices and some form of 
more centralized allocation of NSF resources. 

Some directorates have "infrastructure" 
programs; others do not. Still others fund 
workstations until they reach the "target" set by 
the HPCC coordination office. We believe that 
individual disciplinary program managers 
should consider it their responsibility to fund 
purchase of workstations out of their equipment 
funds. But we recognize that these funds need 
to be supplemented by HPCC funds. CISE has 
an office which co-funds interdisciplinary ap- 
plications of HPC workstations. We believe this 
office may require more budgetary authority 
than it now enjoys, to ensure the proper balance 
of program and CISE budgets for workstations. 

Scientific value must be a primary criterion for 
resource allocation. It would be unwise to sup- 
port mediocre projects just because they require 
supercomputers. The strategy of application ap- 
proval will depend very heavily on funding 
scenarios. If sufficient HPCC funds are made 
available to individual programs for computer 
usage, then the Supercomputer Centers should 
be reserved for applications that cannot be car- 
ried out elsewhere, with particular priority to 
novel applications. If individual science 
programs continue to be underfunded relative to 
large centers, the Supercomputer Centers may 
be forced into a role of supporting less novel or 
demanding computing applications. Under 
these circumstances, less stringent funding 
criteria should be applied. 

C. THE NSF SUPERCOMPUTER 
CENTERS 

Recommendation C-1: The Supercomputer 
Centers should be retained and their missions, as 
they have evolved since the Lax Report, should be 
reaffirmed. However, the NSF HPC effort now 
embraces a variety of institutions and programs - 
HPC Centers, Engineering Research Centers 
(ERC) and Science and Technology Centers 
(STC) devoted to HPC research, and disciplinary 
investments in computer and computational 
science and applied mathematics - all of which are 
essential elements of the HPC effort needed for 
the next decade. NSF plays a primary but not 
necessarily dominant role in each of them (see Fig- 
ure 4 of Appendix D). Furthermore, HPC institu- 
tions outside the NSF orbit also contribute to the 
goals for which the NSF Supercomputer Centers 
are chartered. Thus we ask the Board to recognize 
that the overall structure of the HPC program at 
NSF will have more institutional diversity, more 
flexibility, and more interdependence with other 
agencies and private institutions than in the early 
years of the HPC initiative. 

We anticipate an evolution, which has already 
begun, in which the NSF Supercomputer 
Centers increasingly broaden their base of sup- 
port, and NSF expands its support in collabora- 
tion with other institutional settings for HPC. 
Center-like groups, especially NSF S&T 
Centers, are an important instrument for focus- 
ing on solving barriers to HPC, although they 
do not provide HPC resources to users. An ex- 
cellent example is the multi-institutional Center 
for Research in Parallel Computation at Rice 
University, which is supported at about $4M/yr, 
with additional support from ARPA. Another ex- 
ample is the Center for Computer Graphics and 
Scientific Visualization, an S&T Center award 
to University of Utah with participation of 
University of N. Carolina, Brown, Caltech, and 
Cornell. Still another example is the Discrete 
Mathematics and Computational Science Center 
(DIMACS) at Rutgers and Princeton. These 
centers fill important roles today, and the ERC 
and S&T Center structures provide a necessary 
addition to the Supercomputer Centers for in- 
stitutionalizing the programmatic work required 
for HPC. 

The NSF should continue its current practice of 
encouraging HPC Center collaboration, both 
with one another and with other entities 
engaged in HPC work. The division of the sup- 
port budget into one component committed to 
the Supercomputer Centers and another for 
multi-center activities is a useful management 
tool, even though it may have the effect of 
reducing competition among Supercomputer 
Centers. The National Consortium for HPC 
(NCHPC), formed by NSF and ARPA, is a wel- 
come measure as well. 

Recommendation C-2: The current situation in 
HPC is more exciting, more turbulent, and more 
filled with promise of really big benefits to the na- 
tion than at any time since the Lax report; this is 
not the time to "sunset" a successful, changing 
venture, of which Supercomputer Centers remain 
an essential part. Furthermore, we also recom- 
mend against an open recompetition of the four 
Supercomputer Centers at this time, favoring in- 
stead periodic performance evaluation and com- 
petition for some elements of their activities, both 
among the Centers themselves and when ap- 
propriate with other HPC Centers such as those 
operated by states (see Recommendation D-1). 

Continuing evaluation of each Center's perfor- 
mance, as well as the performance of the overall 
program, is, of course, an essential part of good 
management of the Supercomputer Centers pro- 
gram. Such evaluations must take place on a 
regular basis in order to develop a sound basis 
for adjustments in support levels, to provide in- 
centives for quality performance and to recog- 
nize the need to encourage other institutions 
such as S&T Centers that are attacking HPC bar- 
riers and state-based centers with attractive 
programs in education and training. While 
recompetition of existing Supercomputer 
Centers does not appear to be appropriate at this 
time, if regular review of the Centers and the 
Centers program identifies shortcomings in a 
Center or the total program, a recompetition of 
that element of the program should be initiated. 

Supercomputer Centers are highly leveraged 
through investments by industry, vendors, and 
states. This diversification of support impedes 
unilateral action by NSF, since the Centers' 
other sponsors must be consulted before 
decisions important to the Center are made. 18 It 
also suggests that the issue of recompetition 
may, in future, become moot as the formal 
designation "NSF HPC Center" erodes in sig- 



18 Each year each center gets a cooperative agreement level 
which is negotiated. Each center gets about $14M; about 
15% is flexible. NSF centers have also received help from 
ARPA to buy new MPP machines. Most of the Centers have 
important outside sources of support, which imply obliga- 
tions NSF must respect, such as the Cornell Center relation- 
ship with IBM and the San Diego Center's activities with the 
State of California. 

nificance. There is a form of recompetition al- 
ready in place; the Centers compete for support 
for new machine acquisition and for roles in 
multi-center projects. 

Recommendation C-3: The NSF should con- 
tinue to provide funding to support the Supercom- 
puter Centers' HPC capacity. Any distortions in the 
uses of the computing pyramid that result from 
this dedicated funding are best offset by the recom- 
mendations we make for other elements in the 
pyramid. Provision to scientists and engineers of 
access to leading edge supercomputer resources 
will continue to be a primary purpose of the 
Centers, but it is a means to a broader mission: to 
foster rapid progress in the use of HPC by scien- 
tists and engineers, to accelerate progress in 
usability and economy of HPC and to diffuse HPC 
capability throughout the technical community, in- 
cluding industry. The following additional com- 
ponents of the Center missions should be 
affirmed: 

♦ Supporting computational science, by re- 
search and demonstration in the solution of sig- 
nificant science and engineering problems. 

♦ Fostering interdisciplinary collaboration - 
across sciences and between sciences and com- 
putational science and computer science - as in the 
Grand Challenge programs. 

♦ Prototyping and evaluating software, new ar- 
chitectures, and the uses of high speed data com- 
munications in collaboration with three groups: 
computer and computational scientists, discipli- 
nary scientists exploiting HPC resources, the HPC 
industry, and business firms exploring expanded 
use of HPC. 

♦ Training and education, from post-docs and 
faculty specialists to introduction of less ex- 
perienced researchers to HPC methods, to col- 
laboration with state and regional HPC centers 
working with high schools, community colleges, 
colleges, and universities. 

The role of a Supercomputer Center should, 
therefore, continue to be primarily one of a 
facilitator, pursuing the goals just listed by 
making the hardware and human resources 
available to computational scientists, who 
themselves are intellectual leaders. In this 
way the Centers will participate in leadership 
but will not necessarily be its primary source. 
With certain notable exceptions, intellectual 
leadership in computational science has come 
from scientists around the country who have 
at times used the resources available at the 
Centers. This situation is unlikely to change 
nor should it change. It would be unrealistic 
to place this type of demand on the Super- 
computer Centers and it would certainly not 
be in the successful tradition of American 
science. 

The Supercomputer Centers facilitate interdis- 
ciplinary collaborations because they support 
users from a variety of disciplines, and are 
aware of their particular strengths. The 
Centers have been deeply involved in nucleat- 
ing Grand Challenge teams, and particularly 
in reaching out to bring computer scientists 
together with computational scientists. 
Visualization, for example, is no longer just 
in the realm of the computational scientist; 
experimentalists use the same tools for 
designing and simulating experiments in ad- 
vance of actual data generation. This com- 
mon ground should not be separated from the 
enabling technologies which have made this 
work possible. Rather, high performance 
computing and the new science it has enabled 
have seeded advances that would not have 
happened any other way. 

ALLOCATION OF CENTER HPC 
RESOURCES TO INVESTIGATORS 

Recommendation C-4: The NSF should review 
the administrative procedures used to allocate Cen- 
ter resources, and the relationship of this process 
to the initial funding of the research by the discipli- 
nary program offices, to ensure that the burden on 
scientists applying for research support is mini- 
mized, when that research also requires access to 
the facilities of the Centers, or perhaps access to 
other elements of the HPC pyramid that will be es- 
tablished pursuant to our recommendations. How- 
ever, we believe the NSF should continue to pro- 
vide HPC resources to the research community 
through allocation committees that evaluate com- 
petitively proposals for use of Center resources. 19 

At the present time, the allocation of resources 
in the Supercomputer Centers for all users is 
handled by requiring principal investigators to 
submit annual proposals to a specified Center 
for access to specific equipment. The NSF 
should not require a duplicate peer review of 
the substantive scientific merit of the proposed 
scientific investigation, first by disciplinary pro- 
gram offices, and then again by the Center Al- 
location Committees. For this reason, it is 
proposed that the allocation of supercomputer 
time be combined with the allocation of re- 
search funds to the investigator. 

Although this panel is not in a position to give 
administrative details of such a procedure, it is 
suggested that requests for computer time be at- 
tached to the original regular NSF proposal, 
with (a) experts in computational science in- 
cluded among peer reviewers, or, (b) that por- 
tion of the proposal be reviewed in parallel by a 
peer review established by the Centers. In either 
case only one set of peer reviewers should 
evaluate scientific merits, and only one set of 
reviewers should determine that the research 
task is being formulated properly for use of 
HPC resources. 

Second, we recommend that the Centers collec- 
tively establish the review and allocation 
mechanism, so that while investigators might 



19 For NSF funded investigators, allocation committees at Su- 
percomputer Centers should evaluate requests for HPC 
resources only on the appropriateness of the computational 
plans, choice of machine, and amount of resource requested. 
Centers should rely on disciplinary program office deter- 
minations of scientific merit, based on their peer review. In 
this way a two level review of the merits of the science is 
avoided. A further simplification might be for the application 
for computer time at the Centers to be included in the 
original disciplinary proposal, and forwarded to the Centers 
when the proposal is approved. For non-NSF funded inves- 
tigators an alternative form of peer review of the research is 
required. 

express a preference for a particular computer 
or Center for their work, all Centers' facilities 
would be in the pool from which each inves- 
tigator receives allocations. 

We recognize, of course, that the specific alloca- 
tion of machine time often cannot be made at 
the time of the original proposal for NSF re- 
search support, since in some cases the work 
has not progressed to the point that the mathe- 
matical approach, algorithms, etc., are available 
for Center experts to evaluate and translate into 
estimates of machine time. Nor is the demand 
function for facilities known at that time. 

EDUCATION AND TRAINING 

Recommendation C-5: The NSF should give 
strong emphasis to its education mission in HPC, 
and should actively seek collaboration with state- 
sponsored and other HPC centers not supported 
primarily on NSF funding. Supercomputing 
regional affiliates should be candidates for NSF 
support, with education as a key role. HPC will 
also figure in the Administration's industrial exten- 
sion program, in which the states have the primary 
operational role. 

The serious difficulties associated with the use 
of parallel computers pose a new training bur- 
den. In the past it was expected that individual 
investigators would port their code to new com- 
puters and this could usually be done with 
limited effort. This is no longer the case. The 
Supercomputer Centers should see their future 
mission as providing direct aid to the rewriting 
of code for parallel processors. 

Computational science is proving to be an effec- 
tive way to generate new knowledge. As part of 
its basic mission, NSF needs to teach scientists, 
engineers, mathematicians, and even computer 
scientists how high performance computing can 
be used to produce new scientific results. The 
role of the Supercomputer Centers is critical to 
such a mission since the Centers have expertise 
on existing hardware and software systems, 
modelling, and algorithms, as well as 
knowledge of useful high performance comput- 
ing application packages, awareness of trends in 
high performance computing and requisite staff. 

D. NSF AND THE NATIONAL 
HPC EFFORT; RELATIONSHIPS 
WITH STATES 

Recommendation D-1: We recommend that the 
National Science Board urge OSTP to establish an 
advisory committee representing the states, HPC 
users, NSF Supercomputer Centers, computer 
manufacturers, computer and computational scien- 
tists (similar to the Federal Networking Council's 
Advisory Committee), which should report to 
HPCCIT. A particularly important role for this 
body would be to facilitate state-federal planning 
related to high performance computing. 

Congress required advisory committee report- 
ing to the PMES, but the committee has not yet 
been implemented. The committee we propose 
would provide policy level advice and coordina- 
tion with the states. The main components of 
HPCC are networking and HPC, although the 
networks seem to be receiving priority atten- 
tion. The Panel believes it is important to con- 
tinue to emphasize the importance of ensuring 
adequate compute power in the network to sup- 
port the National Information Infrastructure ap- 
plications. We also believe that as participation 
in HPC continues to broaden through initiatives 
by the states and by industry, the NSF (and 
other federal agencies) should encourage their 
collaboration in the national effort. 

The Coalition of Academic Supercomputer 
Centers (CASC) was founded in 1989 to pro- 
vide a forum to encourage support for high per- 
formance computing and networking. Unlike 
the FCCSET task force, CASC is dependent on 
others to bring the money to support high per- 
formance computing - usually their own State 
government or university. The result is a valu- 
able discussion group for exchanging informa- 
tion and developing a common agenda and 
CASC should be encouraged. However, CASC 
is not a substitute for a more formal federal ad- 
visory body. 

This recommendation is consistent with a recent 
Carnegie Commission Report entitled "Science, 
Technology and the States in America's Third 
Century," which recommends the creation of a 
system of joint advisory and consultative bodies 
to foster federal-state exchanges and to create a 
partnership in policy development, especially 
for construction of national information in- 
frastructure and provision of services based on 
it. Because of the importance of high perfor- 
mance computing to future economic develop- 
ment, we need a new balance of cooperation 
between federal and state government in this 
area, as in a number of others. 

NSF AND HIGH PERFORMANCE COMPUTING: 
HISTORY AND ORIGIN OF THIS STUDY 



Introduction 

This report of the Blue Ribbon Panel on High Per- 
formance Computing follows a number of 
separate, but related, activities in this area by the 
NSF, the computational science community, and 
the Federal Government in general acting in con- 
cert through the Federal Coordinating Committee 
on Science, Engineering, and Technology. The 
Panel's findings and recommendations must be 
viewed within this broad context of HPC. This 
section provides a description of the way in which 
the panel has conducted its work and a brief over- 
view of the preceding accomplishments which 
were used as the starting point for the Panel's 
deliberations. 

The Origin of the Present Panel and Charter 

Following the renewal of four of the five NSF Su- 
percomputer Centers in 1990, the National 
Science Board (NSB) maintained an interest in the 
Centers' operations and activities. Given the na- 
tional scope of the Centers, and the possible im- 
plications for them contained in the HPCC Act of 
1992, the NSB commissioned the formation of a 
blue ribbon panel to investigate the future changes 
in the overall scientific environment due to the rapid 
advances occurring in the field of computers and 
scientific computing. The panel was instructed to 
investigate the way science will be practiced in the 
next decade, and recommend an appropriate role 
for NSF to enable research in the overall comput- 
ing environment of the future. The panel consists 
of representatives from a wide spectrum of the 
computer and computational science communities 
in industry and academia. The role expected of the 
Panel is reflected by its Charter: 

A. Assess the contributions of high perfor- 
mance computing to scientific and engineer- 
ing research and education, including 
ancillary benefits, such as the stimulus to 
the pace of innovation in U.S. industries and 
the public sector. 

B. Project what hardware, software and com- 
munication resources may be available in 
the next five to ten years to further these ad- 
vances and identify elements that may be 
particularly important to the development of 
HPC. 

C. Assess the variety of institutional forms 
through which access to high performance 
computing may be gained including funding 
of equipment acquisition, shared access 
through local centers, and shared access 
through broad band telecommunications. 

D. Project sources, other than NSF, for support 
of such capabilities, and potential coopera- 
tive relationships with: states, private sector, 
other federal agencies, and international 
programs. 

E. Identify barriers to the development of more 
efficient, usable, and powerful means for ap- 
plying high performance computing, and 
means for overcoming them. 

F. Provide recommendations to help guide the 
development of NSF's participation in super- 
computing and its relation to the federal in- 
teragency High Performance Computing 
and Communications Program. 

G. Recommend policies and managerial struc- 
tures needed to achieve NSF program goals, 
including clarification of the peer review 
procedures and suggesting appropriate 
processes and mechanisms to assess pro- 
gram effectiveness necessary for insuring 
the highest quality science and engineering 
research. 

At its first meeting in January 1993, the panel ap- 
proved its Charter, and established a scope of 
work which would allow a final report to be 
presented to the NSB in Summer 1993. A large 
number of questions were raised amplifying the 
Charter's directions. Prior to its second meeting 
in March 1993 the Panel solicited input from the 
national research community; a response to the fol- 
lowing four questions was requested. 

♦ How would you project the emerging high 
performance computing environment and 
market forces over the next five years and the 
implications for change in the way scientists 
and engineers will conduct R&D, design and 
production modeling? 

♦ What do you see as the largest barriers to the 
effective use of these emergent technologies 
by scientists and engineers and what efforts 
will be needed to remove these barriers? 
What is the proper role of government, and, 
in particular, the NSF to foster progress? 

♦ To what extent do you believe there is a fu- 
ture role for government-supported supercom- 
puter centers? What role should NSF play in 
this spectrum of capabilities? 

♦ To what extent should NSF use its resources 
to encourage use of high performance com- 
puting in commercial industrial applications 
through collaboration between high perfor- 
mance computing centers, academic users 
and industrial groups? 

Over fifty responses were received and were con- 
sidered and discussed by the Panel at its March 
meeting. The Panel also received presentations, 
based on these questions, from vendors of high 
performance computing equipment and repre- 
sentatives from non-NSF supercomputer centers. 

NSF's Early Participation in High Performance 
Computing 

Although the National Science Foundation is now 
a major partner in the nation's high performance 
computing effort, this was not always the case. In 
the early 1970s the NSF ceased its support of cam- 
pus computing centers, and by the mid-1970s 
there were no "supercomputers" on any campus 
available to the academic community. Certainly 
computers of this capability were available 
through other government agency (DoE and 
NASA) laboratories, but NSF did not play a role, 
and hence many of its academic researchers did 
not have the ability to perform computational re- 
search on anything other than a departmental mini- 
computer, thereby limiting the scope of their 
research. 

This lack of NSF participation in the high perfor- 
mance computing environment began to be noted 
in the early 1980s with the publication of a grow- 
ing number of reports on the subject. A report to 
the NSF Division of Physics Advisory Committee 
in March 1981 entitled "Prospectus for Computa- 
tional Physics", edited by W. Press, identified a 
"crisis" in computational physics, and recom- 
mended support for facilities. Subsequent to this 
report a joint agency study, "Large Scale Comput- 
ing in Science and Engineering", edited by P. Lax, 
appeared in December 1982 and acted as the 
catalyst for NSF's reemergence in the support of 
high performance computing. The Lax Report 
presented four recommendations for a government- 
wide program: 

♦ Increased access to regularly upgraded super- 
computing facilities via high bandwidth net- 
works 

♦ Increased research in computational mathe- 
matics, software, and algorithms 

♦ Training of personnel in scientific computing 

♦ R&D of new supercomputer systems 

The key suggestions contained in the Lax Report 
were studied by an internal NSF working group, 
and the findings were issued in July 1983 as "A 
National Computing Environment for Academic 
Research", a report edited by M. Bardon and K. 
Curtis. The report studied NSF supported 
scientists' needs for academic computing, and 
validated the conclusions of the Lax Report for the 
NSF supported research community. The findings 
of Bardon/Curtis reformulated the four recommen- 
dations of the Lax Report into a six point im- 
plementation plan for the NSF. Part of this action 
plan was a recommendation to establish ten 
academic supercomputer centers. 

The immediate NSF response was to set up a 
means for academic researchers to have access, at 
existing sites, to the most powerful computers of 
the day. This was an interim step prior to a 
solicitation for the formation of academic super- 
computer centers directly supported by the NSF. 
By 1987, five NSF Supercomputer Centers had 
been established, and all had completed at least 
one year of operation. 

During this phase the Centers were essentially iso- 
lated "islands of supercomputing" whose role was 
to provide supercomputer access to the academic 
community. This aspect of the Centers' activities 
has changed considerably. The NSF concept of 
the Centers' activities was mandated to be much 
broader, as indicated by the Centers' original 
objectives: 

♦ Access to state of the art supercomputers 

♦ Training of computational scientists and en- 
gineers 

♦ Stimulate the U.S. supercomputer industry 

♦ Nurture computational science and engi- 
neering 

♦ Encourage collaboration among researchers 
in academia, industry and government 

In 1988-1989 NSF conducted a review to deter- 
mine whether support was justified beyond 1990. 
In developing proposals, the Centers were advised 
to increase their scope of responsibilities. Quoting 
from the solicitation: 

"To insure the long term health and value of a 
supercomputer center, an intellectual environment, 
as well as first class service, is necessary. 
Centers should identify an intellectual 
component and research agenda". 

In 1989 NSF approved continuation through 1995 
of the Cornell Theory Center, the National Center 
for Supercomputing Applications, the Pittsburgh 
Supercomputing Center, and the San Diego Super- 
computer Center. Support for the John von 
Neumann Center was not continued. 

The Federal High Performance Computing and 
Communications Initiative 

At the same time that the NSF Supercomputer Centers 
were beginning the early phases of their operations, 
the Federal Coordinating Committee for 
Science, Engineering, and Technology began a 
study in 1987 on the status and direction of high 
performance computing, and its relationship to 
federal research and development. The results 
were "A Research and Development Strategy for 
High Performance Computing" issued by the Of- 
fice of Science and Technology Policy (OSTP) in 
November 1987, followed in September 1989 by 
another OSTP document "The Federal High Per- 
formance Computing Program". These two 
reports set the framework for the inter-governmen- 
tal agency cooperation on high performance 
computing which led to the High Performance 
Computing and Communications (HPCC) Act of 
1991. 

HPCC focuses on four integrated components* of 
computer research and applications which very 
closely echo the Lax Report conclusions: 

♦ High Performance Computing Systems - tech- 
nology development for scalable parallel sys- 
tems to achieve teraflop speed 

♦ Advanced Software Technology and Algo- 
rithms - generic software and algorithm 
development to support Grand Challenge 
projects, including early access to production 
scalable systems 

♦ National Research and Education Network - 
to further develop the national network and 
networking tools, and to support the research 
and development of gigabit networks 

♦ Basic Research and Human Resources - to 
support individual investigator research on 
fundamental and novel science and to initiate 
activities to significantly increase the pool of 
trained personnel 



* At the time of writing this Report, a fifth component, entitled 
Information Infrastructure, Technology, and Applications, 
is being defined for inclusion in the HPCC Program. 



With this common structure across all the par- 
ticipating agencies, the Program outlines each 
agency's roles and responsibilities. NSF is the 
lead agency in the National Research and Educa- 
tion Network, and has major roles in Advanced 
Software Technology and Algorithms, and in 
Basic Research and Human Resources. 

The Sugar Report 

After the renewal of the four NSF Supercomputer 
Centers, the NSF Division of Advanced Scientific 
Computing recognized that the computing environment 
within the nation had changed considerably 
from that which existed at the inception of the 
Centers Program. The Division's Advisory Committee 
was asked to survey the future possibilities 
for high performance computing, and report back 
to the Division. Two workshops were held in the 
Fall of 1991 and Spring of 1992. Thirty one par- 
ticipants with expertise in computational science, 
computer science and the operation of major super- 
computer centers were involved. 

The final report, edited by R. Sugar of the U. of 
California at Santa Barbara, recommended future 
directions for the Supercomputer Centers Program 
which would "enable it to take advantage of these 
(HPCC) opportunities and to meet its respon- 
sibilities to the national research community". 
The committee's recommendations can be sum- 
marized as: 

♦ Decisions and planning by the Division need 
to be made in a programmatic way, rather 
than on an individual Center by Center basis - 
the meta-center concept provides a vehicle 
for this management capability which goes 
beyond the existing Centers. 

♦ Access to stable computing platforms (cur- 
rently vector supercomputers) needs to be 
augmented by access to state of the art tech- 
nology (currently massively parallel com- 
puters) - but, the former cannot be sacrificed 
to provide the latter. 

♦ The Supercomputer Centers can be focal 
points for enabling collaborative efforts 
across many communities - computational 
and computer science, private sector and 
academia, vendors and academia. 

Appendix C 



TECHNOLOGY TRENDS and BARRIERS 
to FURTHER PROGRESS 



BACKGROUND: 

What is the state of the HPC industry here and 
abroad? What is its prognosis? 

The high performance computer industry is in a 
state of turmoil, excitement and opportunity. On 
the one hand, the vector multiprocessors manufac- 
tured by many firms, large and small, have con- 
tinued to improve in capability over the years. 
These systems are now quite mature, as measured 
by the fact that delivered performance is a sig- 
nificant fraction of the theoretical peak perfor- 
mance of the hardware, and are still the preferred 
platform for many computational scientists and en- 
gineers. They are the workhorses of high perfor- 
mance computing today and will continue in that 
role even as alternatives mature. 

On the other hand, dramatic improvements in 
microprocessor performance and advances in 
microprocessor-based parallel architectures have 
resulted in "massively parallel" systems that offer 
the potential for high performance at lower cost. 20 
For example, $10 million in 1993 buys over 40 
gigaflops peak processing power in a multicom- 
puter but only 5 gigaflops in a vector multiproces- 
sor. As a result, increasing numbers of 
computational scientists and engineers are turning 
to the highly parallel systems manufactured by 
companies such as Cray Research Inc., IBM, Intel, 
Kendall Square, Thinking Machines Inc., MasPar, 
and nCUBE. 

20 Note that the higher cost of vector machines is partly 
caused by their extensive use of static memory chips for 
main memory and the interconnection networks they use for 
high shared-memory bandwidth. These attributes contribute 
to increased programmability and the realization of a high 
fraction of peak performance on user applications. Realized 
performance on MPP machines is still uncertain. A comparison 
of today's vector machines versus MPP systems 
based on realized performance per dollar reveals much less 
difference in cost-performance than comparisons based on 
peak performance. 
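
The peak-performance comparison above can be worked through as a small calculation. The sketch below uses only the two figures quoted in the text ($10 million buying 40 versus 5 peak gigaflops); the function names and the 50 percent vector efficiency are illustrative assumptions, not values from the report.

```python
# Figures quoted in the text: what $10 million buys in 1993 (peak gigaflops).
BUDGET_DOLLARS = 10_000_000
PEAK_GFLOPS = {
    "multicomputer": 40.0,          # the text says "over 40", so a lower bound
    "vector multiprocessor": 5.0,
}


def dollars_per_peak_gflop(budget, peak_gflops):
    """Cost of one gigaflop of peak (not delivered) performance."""
    return budget / peak_gflops


def break_even_efficiency(vector_efficiency):
    """Fraction of peak the multicomputer must realize to deliver the same
    gigaflops as the vector machine for the same budget.

    `vector_efficiency` is a hypothetical input; the report gives no
    measured value for it.
    """
    vector_delivered = vector_efficiency * PEAK_GFLOPS["vector multiprocessor"]
    return vector_delivered / PEAK_GFLOPS["multicomputer"]


if __name__ == "__main__":
    for name, gflops in PEAK_GFLOPS.items():
        cost = dollars_per_peak_gflop(BUDGET_DOLLARS, gflops)
        print(f"{name}: ${cost:,.0f} per peak gigaflop")
    # Assumed example: the vector machine realizes half of its peak.
    print(f"break-even multicomputer efficiency: {break_even_efficiency(0.5):.2%}")
```

On peak figures the multicomputer is eight times cheaper per gigaflop. Under the assumed 50 percent vector efficiency, it would need to realize only about 6 percent of its peak to match the vector machine's delivered performance, which is why the comparison turns on realized rather than peak performance.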

Microprocessor performance has increased by 4X 
every three years, matching the rate of integrated 
circuit logic density improvement as predicted by 
Moore's law. For example, the microprocessors of 
1993 are around 200 times faster than those of 
1981. By contrast, the clock rates of vector 
processors have improved much more slowly; 
today's fastest vector processors are only five or 
six times faster than 1976's Cray 1. Thus, the per- 
formance gap between these two technologies is 
quickly disappearing in spite of other performance 
improvements in vector processor architecture. 
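
As a rough check on these growth figures, the following sketch compounds the "4X every three years" rule from 1981 to 1993 and converts the vector clock-rate speedup since 1976 into an equivalent yearly rate. The function names are illustrative, and the yearly rates are derived here rather than stated in the report.

```python
def growth_factor(start_year, end_year, factor=4.0, period_years=3.0):
    """Compound growth when performance multiplies by `factor` every `period_years`."""
    return factor ** ((end_year - start_year) / period_years)


def annual_rate(total_factor, years):
    """Equivalent constant yearly improvement for a total speedup over `years`."""
    return total_factor ** (1.0 / years) - 1.0


if __name__ == "__main__":
    print(f"microprocessors 1981-1993: {growth_factor(1981, 1993):.0f}x")
    print(f"microprocessor yearly rate: {annual_rate(4.0, 3):.0%}")
    for speedup in (5.0, 6.0):
        rate = annual_rate(speedup, 1993 - 1976)
        print(f"vector clock, {speedup:.0f}x since 1976: {rate:.1%} per year")
```

The rule gives a factor of 256 over twelve years, the same order as the "around 200 times" quoted in the text. It corresponds to roughly 59 percent per year for microprocessors, against about 10 to 11 percent per year for vector clock rates.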

Although microprocessor-based massively parallel 
systems hold considerable promise for the future, 
they have not yet reached maturity in terms of 
ease of programming and ability to deliver high 
performance routinely to large classes of applica- 
tions. Unfortunately, the programming technology 
that has evolved for the vector multiprocessors 
does not directly transfer to highly parallel sys- 
tems. New mechanisms must be devised for high 
performance communication and coordination 
among the processors. These mechanisms must be 
efficiently supported in the hardware and effective- 
ly embodied in programming models. 

Currently, vendors are providing a variety of sys- 
tems based on different approaches, each of which 
has the potential to evolve into the method of 
choice. Vector multiprocessors support a simple 
shared memory model which demands no par- 
ticular attention to data arrangement in memory. 
Many of the currently available highly parallel ar- 
chitectures are based on the "multicomputer" ar- 
chitecture which provides only a message-passing 
interface for inter-processor communication. 
Emerging architectures, including the Kendall 
Square KSR-1 and systems being developed by 
Convex, Cray Research, and Silicon Graphics, 
have shared address spaces with varying degrees 
of hardware support and different refinements of 
the shared memory programming model. These 
computers represent a compromise in that they 
offer much of the programming simplicity of 
shared memory yet still (at least so far) require 
careful data arrangement to achieve good perfor- 
mance. (The data parallel language on the CM-5 
has similar properties.) A true shared memory 
parallel architecture, based on mechanisms that 
hide memory access latency, is under development 
at Tera Computer. 
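
The difference between the shared memory and message-passing models described above can be illustrated with a toy example. The sketch below is not drawn from any of the systems named in the text: Python threads stand in for processors, the function names are hypothetical, and it shows only the programming style, not performance.

```python
import queue
import threading


def shared_memory_sum(data, workers=4):
    """Shared memory style: every worker reads the one shared list directly
    and writes its partial sum into a shared result list."""
    partial = [0] * workers

    def work(rank):
        # No attention to data placement is needed; take a strided slice.
        partial[rank] = sum(data[rank::workers])

    threads = [threading.Thread(target=work, args=(r,)) for r in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(partial)


def message_passing_sum(data, workers=4):
    """Message-passing style: workers share nothing. Each receives its block
    as an explicit message and replies with an explicit message."""
    inboxes = [queue.Queue() for _ in range(workers)]
    results = queue.Queue()

    def work(rank):
        block = inboxes[rank].get()   # explicit receive
        results.put(sum(block))       # explicit send

    threads = [threading.Thread(target=work, args=(r,)) for r in range(workers)]
    for t in threads:
        t.start()
    # The programmer decides how the data is divided among the workers.
    for rank, inbox in enumerate(inboxes):
        inbox.put(list(data[rank::workers]))
    for t in threads:
        t.join()
    return sum(results.get() for _ in range(workers))


if __name__ == "__main__":
    numbers = list(range(1000))
    print(shared_memory_sum(numbers), message_passing_sum(numbers))
```

Both functions compute the same total. The second makes the data distribution and every transfer explicit, which is the extra burden the multicomputer architecture places on the programmer.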

The size of the high performance computer market 
worldwide is about $2 billion (excluding sales of 
the IBM add-on vector hardware), with Cray Re- 
search accounting for roughly $800 million of it. 
IBM and Fujitsu are also significant contributors 
to this total, but most companies engaged in this 
business have sales of $100 million or less. Some 
companies engaged in high speed computing have 
other, larger sources of revenue (IBM, Fujitsu, 
Intel, NEC, Hitachi); other companies both large 
(Cray Research) and small (Thinking Machines, 
Kendall Square, Meiko, Tera Computer) are high 
performance computer manufacturers exclusively. 
There are certainly more companies in the busi- 
ness than can possibly be successful, and no doubt 
there are new competitors that will appear. Help- 
ing to sustain this high level of competitive in- 
novation should be an important objective for 
NSB policy in HPC. 

FINDINGS 

Where is the hardware going to be in 5 years? 
What will be the performance and cost of the 
most powerful machines, the best workstations, 
the mid-range computers? 

The next five years will continue to see improve- 
ments in hardware price/performance ratios. 
Since microprocessor speeds now closely ap- 
proach those of vector processors, it is unclear 
whether microprocessor performance improve- 
ment can maintain its current pace. Still, as long as 
integrated circuits continue to quadruple in density 
but only double in cost every three years we can 
probably expect a fourfold price/performance improvement 
in both processors and memory by 
1998. Estimating in constant 1993 dollars, the 
most powerful machines ($50 million) will have 
peak performance of nearly a teraflop 21; mini-supercomputers 
($1 million) will advertise 20 
gigaflops peak performance; workstations 
($50,000) will approach 1 gigaflops, and personal 
computers ($10,000) will approach 200 
megaflops. 22 
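
The four projected price points can be compared on a common basis with a short calculation. The sketch below uses only the prices and peak speeds quoted above, rounding "nearly a teraflop" up to a full teraflop; the dictionary and function names are illustrative.

```python
# 1998 projections quoted in the text, in constant 1993 dollars.
# Peak speeds are in megaflops so the arithmetic stays in whole numbers.
PROJECTIONS_1998 = {
    "most powerful machine": (50_000_000, 1_000_000),  # "nearly a teraflop", rounded up
    "mini-supercomputer": (1_000_000, 20_000),
    "workstation": (50_000, 1_000),
    "personal computer": (10_000, 200),
}


def dollars_per_peak_megaflop(price_dollars, peak_megaflops):
    """Price of one megaflop of peak performance."""
    return price_dollars / peak_megaflops


if __name__ == "__main__":
    for name, (price, mflops) in PROJECTIONS_1998.items():
        cost = dollars_per_peak_megaflop(price, mflops)
        print(f"{name}: ${cost:,.0f} per peak megaflop")
```

Every tier works out to about $50 per peak megaflop. In other words, the projection assumes the same underlying component economics at every scale, with machine classes differing mainly in size.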

During this period, parallel architectures will con- 
tinue to emerge and evolve. Just as the CM-5 rep- 
resented a convergence between SIMD and 
MIMD parallel architectures and brought about a 
generalization of the data-parallel programming 
model, it is likely that the architectures will con- 
tinue to converge and better user-level program- 
ming models will continue to emerge. These 
developments will improve software portability 
and reduce the variety of architectures that are re- 
quired for computational science and engineering 
research, although there will likely still be some 
diversity of approaches at the end of this 5-year 
horizon. Questions that may be resolved by 1998 
include: 

• Which varieties of shared memory architec- 
ture provide the most effective tradeoff be- 
tween hardware simplicity, system 
performance, and programming con- 
venience? and 

• What special synchronization mechanisms 
for processor coordination should be sup- 
ported in the hardware? 

Most current systems are evolving in these directions, 
and answers to the issues will provide a 
more stable base for software efforts. Furthermore, 
much of the current computer science research 
in shared memory architectures is looking 
for cost-effective hardware support that can be implemented 
in multiprocessor workstations that are 
interconnected by general-purpose local area networks. 
Thus, technology from high performance 
parallel systems may be expected to migrate to 
workstation networks, further improving the 
capabilities of these systems to deliver high-performance 
computing to particular applications. It is 
possible that in the end the only substantial difference 
between the supercomputers of tomorrow 
and the workstation networks of tomorrow will be 
the installed network bandwidth. 23 

21 One teraflop is 1000 gigaflops or 10^12 floating point 
operations per second. 

22 Spokesmen from Intel, Convex and Silicon Graphics in addressing 
the panel all made even higher estimates than this. 

Where is the software/programmability going 
to be in 5 years? What new programming 
models will emerge for the new technology? 
How transparent will parallel computers be to 
users? 

While the architectural issues are being resolved, 
parallel languages and their compilers will need to 
continue to improve the programmability of new 
high performance computer systems. Implementa- 
tions of "data parallel" language dialects like High 
Performance Fortran, Fortran D, and High Perfor- 
mance C will steadily improve in quality over the 
next five years and will simplify programming of 
both multi-computers and shared address systems 
for many applications. For the applications that 
are not helped by these languages, new languages 
and programming models will emerge, although at 
a slower pace. Despite strong efforts addressing 
the problem from the language research com- 
munity, the general purpose parallel programming 
language is an elusive and difficult quarry, espe- 
cially if the existing Fortran software base must be 
accommodated, because of difficulties with the 
correct and efficient use of shared variables. 

23 While parallel architectures mature, vector multiprocessors 
will continue to evolve. Scaling to larger numbers of processors 
ultimately involves solving the same issues as for the 
microprocessor-based systems. 

Support tools for software development have also 
been making progress, with emphasis on visualization 
of a program's communication and 
synchronization behavior. Vendors are increasingly 
recognizing the need for sophisticated performance 
tuning tools, with most now developing or 
beginning to develop such tools for their 
machines. The increasing number of computer 
scientists who are also using these tools could lead 
to even more rapid improvement in the quality and 
usability of these support tools. Operating sys- 
tems for high performance computers are increas- 
ingly ill suited to the demands placed on them. 
Virtualization of processors and memory often 
leads to poor performance, whereas relatively 
fixed resource partitioning produces inefficiency, 
especially when parallelism within the application 
varies. High performance I/O is another area of 
shortfall in many systems, especially the multi- 
computers. Research is needed in nearly every 
aspect of operating systems for highly parallel 
computers. 

What market forces or technology investments 
drive HPC technologies and products? 

Future high performance systems will continue to 
be built using technologies and components built 
for the rest of the computer industry. Since in- 
tegrated circuit fabrication facilities now represent 
billion dollar capital investments, integrated cir- 
cuits benefit from very large scale economies; ac- 
cordingly it has been predicted that only 
mass-market microprocessors will prove to have 
acceptable costs in future high performance sys- 
tems. Certainly current use of workstation 
microprocessors such as Sparc, Alpha and the RS- 
6000 chips suggests this trend. Even so the cost 
of memory chips is likely to be a major factor in 
the costs of massively parallel systems, which re- 
quire massive amounts of fast memory. Thus the 
integrated circuit technology available for both 
custom designs and industry standard processors 
will increasingly be driven by the requirements of 
much larger markets, including consumer 
electronics. 

The health of the HPC vendors and the structure 
of their products will be heavily influenced by 
demand from industrial customers. Business applications 
represent the most rapidly growing 
market for HPC products; they have much higher 
potential growth than government or academic 
uses. Quite apart from NSF's obligation to con- 
tribute to the nation's economic health through its 
research activities, this fact motivates the impor- 
tance of cooperation with industry users in expand- 
ing HPC usage. This reality means that NSF 
should be attentive to the value of throughput as a 
figure of merit in HPC systems (in contrast with 
turnaround time which academic researchers usual- 
ly favor), as well as the speed with which large 
volumes of data can be accessed. Industry won't 
put up with a stand-alone, idiosyncratic environ- 
ment. 

How practical will be the loose coupling of desk- 
top workstations to aggregate their unused com- 
pute power? 

Networks of workstations will become an impor- 
tant resource for the many computations that per- 
form well on them. The probable success of these 
loosely coupled systems will inevitably raise the 
standard for communication capabilities in the 
multicomputer arena. Many observers believe 
that competition from workstation networks on 
one side and shared address space systems on the 
other will drive multi-computers from the scene 
entirely; in any event, the network bandwidth and 
latency of multi-computers must improve to dif- 
ferentiate them from workstation networks. Many 
large institutions have 1000 or more workstations 
already installed; the utilization rate of their 
processors on a 24 hour basis is probably only a 
few percent. An efficient way to use the power of 
such heterogeneous networks would be more 
financially attractive. It will, however, raise 
serious questions about security, control, virus 
prevention, and accounting programs. 
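
The scale of the idle resource can be estimated from the figures in this paragraph. The sketch below takes the 1000 workstations from the text; the 3 percent utilization (one reading of "a few percent") and the 50 percent harvest efficiency are assumptions made for illustration.

```python
def idle_workstation_hours(workstations, utilization, hours_per_day=24):
    """Processor hours per day that go unused across the installation."""
    return workstations * hours_per_day * (1.0 - utilization)


def harvested_equivalents(workstations, utilization, harvest_efficiency):
    """Number of fully busy workstations the recovered idle cycles amount to.

    `harvest_efficiency` is a hypothetical fraction of idle time that a
    loosely coupled scheduler could actually put to use.
    """
    return workstations * (1.0 - utilization) * harvest_efficiency


if __name__ == "__main__":
    n, used = 1000, 0.03  # 1000 workstations from the text; 3% is an assumed value
    print(f"idle workstation-hours per day: {idle_workstation_hours(n, used):,.0f}")
    print(f"equivalent dedicated workstations: {harvested_equivalents(n, used, 0.5):,.0f}")
```

Under these assumptions, recovering even half of the idle time would be equivalent to several hundred dedicated workstations. That is the financial attraction, set against the security and accounting questions noted above.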

Are there some emerging HPC technologies of 
interest other than parallel processing? What is 
their significance? 

Neural networks have recently become popular 
and have been successfully applied to many pat- 
tern recognition and classification problems. 
Fuzzy logic has enjoyed an analogous renaissance. 



Technologies of this sort are both interesting and 
important in a broad engineering context and also 
are having impact on computational science and 
engineering. Machine learning approaches, such 
as neural networks, are most appropriate in scien- 
tific disciplines where there is insufficient theory 
to support accurate computer modeling and 
simulation. 

How important are simulation and visualiza- 
tion capabilities? 

Simulation will play an ever increasing role in 
science and engineering. Much of this work will 
be able to be carried out on workstations or inter- 
mediate-scale systems, but it will continue to be 
appropriate to share the highest performance sys- 
tems (and the expertise in using them) on a nation- 
al scale, to accomplish large simulations within 
human time scales. Smaller configurations of 
these machines should be provided to individual 
research universities for application software 
development and research that involves modifying 
the operating system and/or hardware. 

Personal computer capabilities will improve, and 
visualization on the desktop will become more 
routine. Scientists and engineers in increasing 
numbers will need to be equipped with visualiza- 
tion capabilities. The usefulness of high perfor- 
mance computing relies on these systems because 
printed lists of numbers (or printed sheaves of pic- 
tures, for that matter) are increasingly unsatisfac- 
tory as an output medium, even for moderately 
sized simulations. 

BARRIERS TO CONTINUED 
RAPID PROGRESS 

What software and/or hardware "inventions" 
are needed? Who will address meeting these 
needs? 

The most important impediment to the use of new 
highly parallel systems has been the difficulty of 
programming these machines and the wide varia- 
tion that exists in communication capabilities 
across generations of machines as well as among 
the machines in a given generation. Application 
software developers are understandably reluctant 
to re-implement their large scale production codes 
on multi-computers, when significant effort is re- 
quired to port the codes across parallel systems as 
they evolve. In theory, any programming model 
can be implemented (by appropriate compilers) on 
any machine. However, the inefficiency of certain 
models on certain architectures is so great as to 
render them impractical. 24 What is needed in high 
performance computing is an architectural consen- 
sus and a simple model to summarize and abstract 
the machine interface to allow compilers to be 
ported more easily across systems, facilitating the 
portability of application programs. Ideally, the 
consensus interface should efficiently support ex- 
isting programming models (even the multi-com- 
puters have created their own dusty decks), as well 
as more powerful models. Considerable research 
in the computer science community is currently 
devoted to these issues. It is unlikely that the diver- 
sity of programming models will decrease within 
the next five years, but it is likely that models will 
become more portable. 

How important will be access to data, data 
management? 

Besides needing high performance I/O, some 
fields of computational science need widely dis- 
tributed access to data bases that are extremely 
large and constantly growing. The need is par- 
ticularly felt in the earth and planetary sciences, al- 
though the requirements are also great in cellular 
biology, high energy physics, and other dis- 
ciplines. Large scale storage hierarchies and the 
software to manage them must be developed, and 
means to distribute the data nationally and interna- 
tionally are also required. Although this area of 
high performance computing has been relatively 
neglected in the past, these problems are now 
receiving significantly more attention. 



24 For example, it is not practical to implement data-parallel 
compilers on the Intel iPSC/860. 

ROLES FOR GOVERNMENT AGENCIES 

What should government agencies (NSF, DoD, 
DoE) do to advance HPC beyond today's state 
of the art? What more might they be doing? 

The National Science Foundation plays several 
critical roles in advancing high performance com- 
puting. First, NSF's support of basic research and 
human resources in all areas of science and en- 
gineering (and particularly in mathematics, com- 
puter science and engineering) has been 
responsible for many of the advances in our ability 
to successfully tackle grand challenge problems. 
The Supercomputer Centers and the NSFnet have 
been essential to the growth of high performance 
computing as a basic paradigm in science and en- 
gineering. These efforts have been successful and 
should be continued. 

However, NSF has done too little in supporting 
computational engineering in the computer 
science community. For example, the NSF Super- 
computer Centers were slow in providing ex- 
perimental parallel computing facilities and are 
currently not responding adequately to integrating 
emerging technologies from the computer science 
community. Although this situation is gradually 
changing, the pace of the change should be ac- 
celerated. 

Many advances in high performance computer sys- 
tems have been funded and encouraged by the Ad- 
vanced Research Projects Agency (ARPA), the 
major supporter of large scale projects in com- 
puter science and engineering research and 
development in the US. ARPA has been charged 
by Congress to champion "dual use" technology; 
in so doing it is addressing many of the needs of 
computational science and engineering, even in 
the mathematical software arena, that are common 
to defense and commercial applications, and the 
science that underlies both. 

The Department of Energy has traditionally 
provided substantial support to computer science 
and engineering research within its national 
laboratories and at universities, with strong impetus 
being provided by national defense requirements 
and resources. More recently, the focus has 
shifted to the high performance computing and 
communications needs of the unclassified Energy 
Research programs within DoE. The National 
Energy Research Supercomputer Center (NERSC) 
and the Energy Sciences Network (ESnet) provide 
production services similar to the NSF supercom- 
puter centers and the NSFnet. Under the DoE 
HPCC component, "grand challenge" applications 
are supported at NERSC and also at two High Per- 
formance Computing Research Centers (HPCRCs) 
which offer selected access for grand challenge ap- 
plications to leading edge parallel computing 
machines. DoE also sponsors a variety of graduate 
fellowships in the computational sciences. The 
computational science infrastructure and traditions 
of DoE remain sound; however, the ability of the 
Department to advance the state-of-the-art in high 
performance computing systems will be paced by 
its share of the funding available through the 
Federal High Performance Computing Initiative or 
through Defense conversion funds. 

The Department of Commerce has not been a sig- 
nificant source of funds for computer system re- 
search and development since the very early days 
of the computer industry when the National 
Bureau of Standards built one of the first digital 
computers. NBS has been an important factor in 
supporting standards development, particularly for 
the Federal Information Processing Standards is- 
sued by GSA. The expanded role of the National 
Institute for Standards and Technology (as NBS is 
now called) under the Clinton administration may 
include this kind of activity, especially when in- 
dustrial participation is a desired component. 

NASA is embarked on a number of projects of 
potential importance, especially in the develop- 
ment of a shared data system for the global 
climate change program, which will generate mas- 
sive amounts of data from the Earth Observing 
Satellite Program. 

What is the role of the NSF computer science and applied 
mathematics research program? Is it 
relevant to the availability of HPC resources in 
a five year time span? 



Investments in mathematics and computer science 
research provide the foundation for attacking 
today's problems in high performance computing 
and must continue. NSF continues to be the 
primary U.S. source of funds for mathematics and 
computer science research within the scope of 
what one or two investigators and several graduate 
assistants can do. Many fundamental advances in 
algorithms, programming languages, operating 
systems, and computer architecture have been 
NSF funded. This mission has been just as vital as 
ARPA's and is complementary to it. 

Among the largest barriers to the effective use of 
emergent computing technologies are shortfalls in 
parallel architectures from which it is relatively easy 
to extract peak performance, system software 
(operating systems, databases, compilers, programming 
models) to take advantage of these architectures, 
parallel algorithms, mathematical modeling, 
and efficient and high order numerical techniques. 
These are core computational mathematics and 
computer science/engineering research issues, 
many of which are best tackled through NSF's 
traditional peer-reviewed model. NSF should in- 
crease its support of this work. Increased invest- 
ment in basic research and human resources in 
mathematics and computer science/engineering 
could significantly accelerate the pace of HPC 
technology development. In addition, technology 
transfer can be increased by supporting a new 
scale of research not currently being funded by 
any agency: small teams with annual budgets in 
the $250K - $1M range. These projects were once
supported by DoE, and also indirectly by ARPA at
the former "block grant" universities. 25 These
enterprises are now generally too small for ARPA
and seem to be too big for current NSF budget
levels in computer science. A project of this scale
could develop and release an innovative piece of
software to the high performance computing
community at large, or build a modest hardware
prototype as a stepping stone to more significant
funding. A project of this scale would also allow
multi-disciplinary collaboration, either within
mathematics and computer science/engineering
(architecture, operating systems, compilers,
algorithms), or with related disciplines (astronomers
or chemists working with computer scientists and
mathematicians interested in innovative programming
or architectural support for that problem
domain).

25 ARPA made an enormous contribution to the maturing of
computer science as a discipline in U.S. universities by
consistently funding research at about $1 million per year at
MIT, Carnegie Mellon, Stanford University and Berkeley. At
this level of consistent support these universities could build
up a critical mass of faculty and train the next generation
of faculty leadership for departments being set up at every
substantial research university. This targeted investment
played a role in computer science not unlike what NSF has
done in computational science at the four Centers.

Supercomputer Usage by State: Fiscal Year 1992

Figure 2

Trends in user status at four National Supercomputer Centers
(Data from use of vector supercomputers only)

Figure 3

Figure 4

Trend in NSF Supercomputer Center Leverage and
Growth in number of users

Appendix E



REVIEW AND PROSPECTUS OF COMPUTATIONAL 
AND COMPUTER SCIENCE AND ENGINEERING 

Personal Statements by Panel Members 

Computational Mechanics and Structural Analysis 
by Theodore Belytschko 



High performance computing has had a dramatic 
impact on structural analysis and computational 
mechanics, with significant benefits for various in- 
dustries. The finite element method, which was 
developed at aerospace companies such as Boeing 
in the late 1950's and subsequently at the univer- 
sities, has become one of the key tools for the 
mechanical design of almost all industrial 
products, including aircraft, automobiles, power 
plants, packaging, etc. The original applications of
finite element methods were primarily in linear
analysis, which is useful for determining the
behavior of engineering products in normal operating
conditions. Most linear finite element analyses
are today performed on workstations, except for
problems on the order of 1 million unknowns.

Supercomputers are used primarily for nonlinear
analysis, where they replace prototype testing.
One rapidly developing area has been automobile
crashworthiness analysis, where models of
automobiles are used to design for occupant safety
and for features such as accelerometer placement
for air bag deployment. The models which are
currently used are generally on the order of 
100,000 to 250,000 unknowns, and even on the 
latest supercomputers such as the CRAY C90 re- 
quire on the order of 10 to 20 hours of computer 
time. Nevertheless these models are still often too 
coarse to permit true prediction and hence they 
must be tuned by tests. 

Such models have had a tremendous impact on 
reducing the design time for automobiles, since 
they eliminate the need for building numerous 
prototypes. Almost all major automobile
manufacturers have undertaken extensive programs in
crashworthiness simulations by computer on high 
performance machines, and many manufacturers 
have bought supercomputers almost expressly for 
crashworthiness simulation. 

Because of the increasing concern with safety
among manufacturers of many other products,
nonlinear analyses are also emerging in many other
industries: the manufacture of trucks and
construction equipment, where the product must
be certified for safety in various accidents such as
overturning or impact due to falling construction
equipment; railroad car safety; and the safety of
aircraft, where recently the FAA has undertaken
programs to simulate the response of aircraft to
small weapons so that damage from such
explosives can be minimized. Techniques of this
type are also being used to analyze the safety of
jet engines under bird impact, the containment of
fragments in case of jet engine failure, and bird
impact on aircraft canopies. In several cases, NSF
Supercomputer Centers have introduced industry to
the potential of this type of simulation. In all of
these, highly nonlinear analyses which require on
the order of lOx floating point operations for a 
simulation must be made: such simulations even 
on today's supercomputers are still often so time 
consuming that decisions cannot be reached fast 
enough. Therefore an urgent need exists for in- 
creasing the speed with which such simulations 
can be made. 

Nonlinear finite element analysis is also becoming
increasingly important in the simulation of
manufacturing processes. For example, tremendous
improvements can be made in processes such as
sheet metal forming, extrusion, and machining
if these are carefully designed through
nonlinear finite element simulation. These
simulations offer large cost reductions and reduce design
time. Also the design of materials can be
improved if computers are first used to examine how
these materials fail and then to design the material
so that either failure is reduced or the
material fails in a less catastrophic manner. Such
simulations require great resolution, and at the tips
of cracks phenomena at the atomic scale must be
considered.



Most of the calculations mentioned above are not 
made with sufficient resolution because of limita- 
tions in computational power and speed. Also, im- 
portant physical phenomena are omitted for 
reasons of expediency, and their computational 
modeling is not well understood. Therefore, the 
availability of more computational power will in- 
crease our understanding of modeling nonlinear 
structural response and provide industry with 
more effective tools for design. 



Cellular and Systemic Biology 
by Teresa Chay 



HPC has made a great impact on a variety of 
biological disciplines, such as physiology, biologi- 
cal macromolecules, and genetics. I will discuss 
below three vital organs in our body where our
understanding has greatly benefitted from
high-performance computing and will continue to do so in
the future. 

Computer Models For Vital Organs In Our Body 

Although the heart, brain, and pancreas function 
differently in our body (i.e., the heart circulates 
the blood, the brain stores and transfers informa- 
tion, and the pancreas secretes vital hormones 
such as insulin), the mechanisms underlying their 
functioning are quite similar - "excitable" cells 
that are coupled electrically and chemically, form- 
ing a network. 

Ion channels in the cell membranes are involved 
in information transfer. The ion channels receive 
stimuli from neighboring cells and from cells in 
other organs. Upon receiving stimuli some ion 
channels open while others close. When these 
channels are open, they pass ions into or out of the 
cells, creating an electrical difference (membrane 
potential) between the outside and the inside of 
the cell. Some of these ion channels are sensitive
to the voltage (i.e., membrane potential) and
others are responsive to chemical substances (e.g.,
neurotransmitters/hormones).

Opening of ion channels creates the "action poten- 
tial" which spreads from cell to cell, either direct- 
ly or via chemical mediators. Electrical 
transmission and chemical transmission are
interdependent in that chemical substances can
influence the ionic currents and vice versa. For
example, the arrival of the action potential at a 
presynaptic terminal may cause a release of chemi- 
cal substances; in turn these chemicals can 
open/close the ion channels in the postsynaptic 
cell. 

Why is high-performance computing needed? 
How the signals are passed from cell to cell is a 
nonlinear dynamical problem and can be treated 
mathematically by solving simultaneous differen- 
tial equations. These equations involve voltage, 
conductance of the ionic current, and concentra- 
tion of those chemical substances that influence 
conductances. Depending on the model, each
network can be represented by a set of several
million differential equations. The need for parallel
processors is obvious: the organs process
information in just the same way as the most powerful
parallel supercomputers do. Since the mechanisms
involved in these three organs are essentially the
same, algorithms developed for one can be easily
modified to solve for another.
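
As a minimal sketch of the kind of system described above, the following
integrates a short chain of electrically coupled excitable cells. It rests on
several assumptions that are not in the text: FitzHugh-Nagumo kinetics stand in
for a detailed ion-channel model, cells are coupled to their nearest neighbors
with a single gap-junction strength g, time stepping is forward Euler, and all
parameter values are illustrative. A realistic organ model would use far more
cells and equations per cell, which is where parallel machines become
necessary.

```python
# Chain of excitable cells coupled through gap junctions (illustrative sketch).
# Assumption: FitzHugh-Nagumo kinetics stand in for a detailed ion-channel model.

def simulate_chain(n_cells=20, steps=4000, dt=0.05, g=1.0,
                   a=0.7, b=0.8, eps=0.08, stim=1.0, stim_steps=100):
    """Integrate n_cells coupled cells with forward Euler.

    v plays the role of the membrane potential, w is a slow recovery variable.
    Only cell 0 receives an external stimulus, for the first stim_steps steps.
    Returns the peak value of v reached by every cell.
    """
    v = [-1.2] * n_cells      # close to the resting state of a single cell
    w = [-0.625] * n_cells
    peak = list(v)
    for step in range(steps):
        new_v, new_w = [], []
        for i in range(n_cells):
            # No-flux ends: a missing neighbor is replaced by the cell itself.
            left = v[i - 1] if i > 0 else v[i]
            right = v[i + 1] if i < n_cells - 1 else v[i]
            coupling = g * (left - 2.0 * v[i] + right)
            i_ext = stim if (i == 0 and step < stim_steps) else 0.0
            dv = v[i] - v[i] ** 3 / 3.0 - w[i] + i_ext + coupling
            dw = eps * (v[i] + a - b * w[i])
            new_v.append(v[i] + dt * dv)
            new_w.append(w[i] + dt * dw)
        v, w = new_v, new_w
        peak = [max(p, x) for p, x in zip(peak, v)]
    return peak


if __name__ == "__main__":
    coupled = simulate_chain()
    uncoupled = simulate_chain(g=0.0)
    print("last cell peak, coupled:  ", coupled[-1])
    print("last cell peak, uncoupled:", uncoupled[-1])
```

With coupling switched on, the stimulus applied to the first cell should travel
down the chain as an action potential; with g set to zero the distant cells
stay at rest. The work per time step grows with the number of cells, and each
cell needs only its neighbors' values, which is why such models map naturally
onto parallel processors.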

Three specific areas in which high-performance 
computing is central are cardiac research, neural 
networks, and insulin secretion. These are 
detailed below. 

Cardiac Research 

It would be a great benefit to cardiac research if a 
realistic computer model of the heart, its valves 
and the nearby major vessels were to become 
available. With such a model the general public 
would be able to see how the heart generates its 
rhythm, how this rhythm leads to contraction, and 
how the contraction leads to blood circulation. 
Scientists, on the other hand, could study normal 
and diseased heart function without the limitations 
of using human and animal subjects. With future 
HPC and parallel processing, it may be possible to 
build a model heart without consuming too many 
hours of computer time. As a step toward achiev- 
ing this goal, the scientists in computational car- 
diology have thus far accomplished the following 
three objectives. 

1. A computer model of blood flow in the
heart: Researchers have used supercom- 
puters to design an artificial mitral valve 
(the valve that controls blood flow between 
the left atrium and left ventricle) which is 
less likely to induce clots. The computer 
simulated mitral valve has been patented 
and licensed to a corporation developing it 
for clinical use. With parallel processors, 
this technique is now expanded in order to 
construct a realistic three dimensional heart. 

2. Constructing an accurate map of the electrical
potential of the heart surface (epicardium):
Arrhythmia in the heart is caused by
a breakdown in the normal pattern of cardiac
electrical activity. Many arrhythmias
occur because of abnormal tissue inside
the heart. Bioengineers have been developing
a technique with which to obtain the
epicardial potential map from the coarse
information that can be recorded on the
surface of the body via electrocardiogram.
With such a map, clinicians can accurately
locate the problem tissue and remove it with
a relatively simple surgical procedure
instead of with drastic open-heart surgery.

3. Controlling sudden cardiac death: Sudden 
cardiac death is triggered by an extra heart 
beat. Such a beat is believed to initiate spiral 
waves (i.e., reentrant arrhythmias) on the 
main pumping muscle of the heart known as 
the ventricular myocardium. With HPC it is 
possible to simulate how this part of the 
heart can generate reentrant arrhythmia 
upon receiving a premature pulse. Computer 
modelling of reentrant arrhythmia is very 
important clinically since it can be used as a 
tool to predict the onset of this type of dead- 
ly arrhythmia and find a means to cure it by 
properly administering antiarrhythmic
drugs (instead of actually carrying out ex- 
periments on animals). 

Parallel computing and development of better 
software will soon enable the researchers to ex- 
tend their simulations to a more realistic three- 
dimensional system which includes the detailed 
geometry of ventricular muscle.

Neural Networks 

Learning how the brain works is a grand challenge 
in science and engineering. Artificial neural nets 
are based largely on their connection patterns, but 
they have very simple processing elements or 
nodes (either ON or OFF). A simple network
is arranged in layers, with an input
layer (sensory receptors), one or more "hidden" 
layers (representing the interneurons which allow 
animals to have complex behaviors), and an out- 
put layer (motor neurons). Each unit in a neural 
net receives inputs, both excitatory and inhibitory, 
from a number of other units and, if the strength 
of the signal exceeds a given threshold, the "on- 
unit" sends signals to other units. 
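
The threshold behavior described above can be sketched in a few lines. The
example below is illustrative only: the weights and thresholds are hypothetical
values chosen so that a network with one hidden layer computes the exclusive-or
of two inputs, something a single threshold unit cannot do.

```python
# Layered network of binary threshold units (illustrative sketch).
# The weights and thresholds are hypothetical, chosen so the network computes XOR.

def threshold_unit(inputs, weights, threshold):
    """Return 1 (ON) if the weighted input sum exceeds the threshold, else 0 (OFF).

    Positive weights act as excitatory inputs, negative weights as inhibitory.
    """
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total > threshold else 0


def layer(inputs, weight_rows, thresholds):
    """Evaluate one layer; each row of weights feeds one unit."""
    return [threshold_unit(inputs, w, t) for w, t in zip(weight_rows, thresholds)]


def xor_network(x1, x2):
    """Input layer -> hidden layer of two units -> single output unit."""
    hidden = layer([x1, x2], [[1, 1], [1, 1]], [0.5, 1.5])  # an OR unit and an AND unit
    output = layer(hidden, [[1, -1]], [0.5])                # ON when OR is on and AND is off
    return output[0]


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, "->", xor_network(a, b))
```

The hidden layer is what lets the output depend on a combination of inputs
rather than on their simple sum, which is the role the text assigns to
interneurons.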

The real nervous system, however, is a complex 
organ that cannot be viewed simply as an artificial 
neural net. Neural nets are not hard-wired but are
made of neurons which are connected by synapses.
There are at least 10 billion active neurons in
the brain. There are thousands of synapses per 
neuron, and hundreds of active chemicals which 
can modify the properties of ion channels in the 
membrane. 

With HPC and massively parallel computing,
neuroscientists are moving into a new phase of in- 
vestigation which focuses on biological neural 
nets, incorporating features of real neurons and the 
connectivity of real neural nets. Some of these 
models are capable of simulating patterns of 
electrical activity, which can be compared to ac- 
tual neuronal activity observed in experiments. 
With the biological neural nets, we begin to under- 
stand the operation of the nervous system in terms 
of the structure, function and synaptic connec- 
tivity of the individual neurons. 

Insulin Secretion 

Insulin is secreted from the beta cells in the 
pancreas. To cure diabetes it is essential to under- 
stand how beta cells release insulin. The beta cells 
are located in a part of the pancreas known as the islet
of Langerhans. In islets, beta cells are coupled by
a special channel (gap junctional channel) which
connects one cell to the next. Gap junctional
channels allow small ions such as calcium ions to pass
through from cell to cell. In the plasma membrane of beta
cells, there are ion channels whose properties
change when the content of calcium ions changes.
There are other types of cells in the islet which 
secrete hormones. These hormones in turn in- 
fluence insulin secretion by altering the properties 
of the receptors bound in the membrane of a beta 
cell. Thus the study of how beta cells release
insulin involves very complex non-linear dynamics.
With a supercomputer it is possible to construct a 
model of the islet of Langerhans. With this model, 
researchers would learn how beta cells release in- 
sulin in response to the external signals such as 
glucose, neurotransmitters and hormones. They 
would also learn the roles of other cell types in the 
islet of Langerhans and how they influence the 
functional properties of beta cells. A model in
which beta cells function as a cluster has
already been constructed.



Material Science and Condensed Matter Physics 
by Gary S. Grest 



The impact of high performance computing on 
material science and condensed matter physics has 
been enormous. Major developments in the sixties 
and seventies set the stage for the establishment of 
computational material science as a third
discipline, equal to, yet distinct from, analytic theory
and experiment. These developments include the
introduction of molecular dynamics and Monte 
Carlo methods to simulate the properties of liquids 
and solids under a variety of conditions. Density 
functional theory was developed to model the 
electron-electron interactions and pseudopotential 
methods to model the electron-ion interactions. 
These methods were crucial in computing the 
electronic structure for a wide variety of solids. 
Later, the development of path integral and
Green's function Monte Carlo methods allowed
one to begin to simulate quantum many-body
problems. Quantum molecular dynamics methods, which
combine well-established electronic methods
based on local density theory with molecular
dynamics for atoms, have recently been
introduced. On a more macroscopic scale,
computational mechanics, which was discussed above by T.
Belytschko, was developed to study
structure-property relations.

Current usage of high performance computing in 
material science and condensed matter physics can 
be broadly classified as Classical Many-Body and 
Statistical Mechanics, Electronic Structure and 
Quantum Molecular Dynamics, and Quantum
Many-Body, which are discussed below.

Classical Many-Body and Statistical Mechanics

Classical statistical mechanics, where one treats a
huge number of atoms collectively, dates back
to Boltzmann and Gibbs. In these systems,
quantum mechanics plays only a subsidiary role. While
it is needed to determine the interaction between 
atoms, in practice these interactions are often 
replaced by phenomenologically determined pair- 
wise forces between the atoms. This allows one to 
treat large ensembles of atoms, by molecular 
dynamics and Monte Carlo methods. Successes of 
this approach include insight into the properties of 
liquids, phase transitions and critical phenomena, 
crystallization of solids and compositional order- 
ing in alloys. For systems where one needs a quan- 
titative comparison to experiment, embedded atom 
methods have been developed in which empirical- 
ly determined functions are employed to evaluate 
the energy and forces. Although the details of the 
electronic structure are lost, these empirical 
methods have been successful in giving 
reasonable descriptions of the physical processes 
in many systems in which directional bonding is 
not important. Theoretical work in the mid-70's on 
renormalization group methods showed that a
wide variety of different kinds of phase transitions 
could be classified according to the symmetry of 
the order parameter and the range of the interac- 
tion and did not depend on the details of the inter- 
action potential. This allowed one to use relatively 
simple models, usually on a lattice, to study
critical phenomena and phase behavior.

While the basic computational techniques used in 
classical many-body theory are now well estab- 
lished, there remain a large number of important 
problems in material science which can only be
addressed with these techniques. At present, with
Cray YMP class computers, one can typically
handle thousands of atoms for hundreds of
picoseconds. With the next generation of massively
parallel machines, this can be extended to millions 
of particles for microseconds. While not all 
problems require this large number of particles or 
long times, many do. Problems which will benefit 
from the faster computational speed typically
involve large length scales, long time scales, or both.
Examples include polymers and macromolecular
liquids, where the typical length scale of each
molecule can be hundreds of angstroms and relaxation
time scales extend from microseconds
upward; liquids near their glass transition, where
relaxation times diverge exponentially; nucleation
and phase separation, which require both large
systems and long times; and the effects of shear.
Macromolecular liquids typically contain objects of very
different sizes. For example, in most colloidal
suspensions, the colloid particles are hundreds of
angstroms in size while the solvent is only a few
angstroms. At present, the solvent molecules must
be treated as a continuum background. While this
allows one to study the static properties of the
system, the dynamics are incorrect. Faster computers
will allow us to study flocculation, sedimentation
and the effects of shear on order. Non-equilibrium
molecular dynamics methods have been developed 
to simulate particles under shear. However due to 
the lack of adequate computation power, simula- 
tions at present can only be carried out at unphysi- 
cally high shear rates. Access to HPC will enable 
one to understand the origins of shear thinning and 
thickening in a variety of technologically impor- 
tant systems. While molecular dynamics simula- 
tions are inherently difficult to vectorize, recent 
efforts to run them on parallel computers have 
been very encouraging, with increases in speed of 
nearly a factor of 30 in comparison to the Cray 
YMP. 
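
To make the molecular dynamics approach concrete, here is a minimal sketch of
atoms moving under pairwise forces. It is not taken from the report: it assumes
a Lennard-Jones pair potential in reduced units, velocity Verlet time stepping,
open boundaries, and only a handful of atoms. Production codes add neighbor
lists, periodic boundaries and domain decomposition across processors.

```python
# Molecular dynamics with phenomenological pairwise forces (illustrative sketch).
# Assumptions: Lennard-Jones potential in reduced units (epsilon = sigma = mass = 1),
# velocity Verlet integration, open boundaries, a handful of atoms.

def lj_forces(pos):
    """Return (forces, potential_energy) for 3-D positions given as [x, y, z] lists."""
    n = len(pos)
    forces = [[0.0, 0.0, 0.0] for _ in range(n)]
    energy = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            d = [pos[i][k] - pos[j][k] for k in range(3)]
            r2 = sum(c * c for c in d)
            inv6 = 1.0 / r2 ** 3                      # r**-6
            energy += 4.0 * (inv6 * inv6 - inv6)
            # (-dU/dr) / r, so that multiplying by d gives the force vector on atom i
            f_over_r = 24.0 * (2.0 * inv6 * inv6 - inv6) / r2
            for k in range(3):
                forces[i][k] += f_over_r * d[k]
                forces[j][k] -= f_over_r * d[k]
    return forces, energy


def kinetic_energy(vel):
    return 0.5 * sum(c * c for v in vel for c in v)


def run_md(pos, vel, dt=0.002, steps=1000):
    """Advance the system with velocity Verlet; return (pos, vel, total_energy)."""
    pos = [list(p) for p in pos]
    vel = [list(v) for v in vel]
    forces, pot = lj_forces(pos)
    for _ in range(steps):
        for i in range(len(pos)):
            for k in range(3):
                vel[i][k] += 0.5 * dt * forces[i][k]   # first half kick
                pos[i][k] += dt * vel[i][k]            # drift
        forces, pot = lj_forces(pos)
        for i in range(len(pos)):
            for k in range(3):
                vel[i][k] += 0.5 * dt * forces[i][k]   # second half kick
    return pos, vel, pot + kinetic_energy(vel)


if __name__ == "__main__":
    start = [[0.0, 0.0, 0.0], [1.2, 0.0, 0.0], [0.0, 1.2, 0.0]]
    rest = [[0.0, 0.0, 0.0] for _ in start]
    e0 = lj_forces(start)[1]
    _, _, e1 = run_md(start, rest)
    print("total energy before and after:", e0, e1)
```

The double loop over pairs is the expensive part. Its cost grows with the
square of the number of atoms unless the interaction is cut off at short range,
and it is this loop that parallel machines divide among processors.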

Monte Carlo simulations on a lattice remain a very 
powerful computational technique. Simulations of 
this type have been very successful in under- 
standing critical phenomena, phase separation, 
growth kinetics and disordered magnetic systems. 
Successes include accurate determination of 
universal critical exponents, both static and 
dynamic, and evidence for the existence of a phase 
transition in spin glasses. Future work using mas- 
sively parallel computers will be essential to under- 
stand wetting and surface critical exponents as 
well as systems with complex order parameters. 
Direct numerical integration of a set of Langevin
equations that describe the nonlinear fluctuating
hydrodynamics can be carried out in two dimensions
on a Cray YMP class supercomputer but the
extension to three dimensions requires HPC. Finally,
cellular automata solutions of Navier-Stokes and
Boltzmann equations are a powerful method for
studying hydrodynamics. All of these methods,
because of their inherent locality, run very efficiently
on parallel computers.
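
The lattice Monte Carlo simulations mentioned above can be illustrated with the
simplest standard case. The sketch below assumes the two-dimensional Ising
model with single-spin-flip Metropolis updates on a small periodic lattice; the
model, lattice size, temperatures and number of sweeps are all illustrative
choices rather than anything specified in the text.

```python
# Metropolis Monte Carlo for the 2-D Ising model on a periodic square lattice
# (illustrative sketch; the model and all parameters are assumptions).
import math
import random


def metropolis_ising(size=16, temperature=1.5, sweeps=200, seed=0):
    """Return the mean absolute magnetization per spin over the second half of the run."""
    rng = random.Random(seed)
    spins = [[1] * size for _ in range(size)]   # start from the fully ordered state
    samples = []
    for sweep in range(sweeps):
        for _ in range(size * size):
            i, j = rng.randrange(size), rng.randrange(size)
            neighbors = (spins[(i + 1) % size][j] + spins[(i - 1) % size][j]
                         + spins[i][(j + 1) % size] + spins[i][(j - 1) % size])
            delta_e = 2.0 * spins[i][j] * neighbors   # energy change of a flip (J = 1)
            if delta_e <= 0 or rng.random() < math.exp(-delta_e / temperature):
                spins[i][j] = -spins[i][j]
        if sweep >= sweeps // 2:
            m = sum(sum(row) for row in spins) / (size * size)
            samples.append(abs(m))
    return sum(samples) / len(samples)


if __name__ == "__main__":
    print("T = 1.5:", metropolis_ising(temperature=1.5))
    print("T = 5.0:", metropolis_ising(temperature=5.0))
```

Below the transition temperature the lattice should stay largely magnetized,
while well above it the magnetization should fall toward zero. Each update
touches only a spin and its four neighbors, which is the locality that makes
such methods efficient on parallel computers.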

Electronic Structure and Quantum Molecular 
Dynamics 

The ability of quantum mechanics to predict the 
total energy of a system of electrons and nuclei 
enables one to reap tremendous benefits from
quantum-mechanical calculations. Since many 
physical properties can be related to the total ener- 
gy of a system or to differences in total energy, 
tremendous theoretical effort has gone into 
developing accurate local density functional total 
energy techniques. These methods have been very 
successful in predicting with accuracy equilibrium 
constants, bulk moduli, phonons, piezoelectric 
constants and phase-transition pressures and 
temperatures for a variety of materials. These 
methods have recently been applied to study the 
structural, vibrational, mechanical and other 
ground state properties of systems containing up 
to several hundred atoms. Some recent successes 
include the unraveling of the normal state proper- 
ties of high T_c superconducting oxides, predic- 
tions of new phases of materials under high 
pressure, predictions of superhard materials, deter- 
mination of the structure and properties of sur- 
faces, interfaces and clusters, and calculations of 
properties of fullerenes and fullerites. 

Particularly important are the developments of the 
past few years which make it possible to carry out 
"first principles" computations of complex atomic 
arrangements in materials starting from nothing 
more than the identities of the atoms and the rules 
of quantum mechanics. Recent developments in 
new iterative diagonalization algorithms coupled 
with increases in the computational efficiency of 
modern high performance computers have made it 
possible to use quantum mechanical calculations 
of the dynamics of systems in the solid, liquid and 
gaseous state. The basic idea of these methods,
which are known as ab initio methods, is to
minimize the total energy of the system by allowing
both the electronic and the ionic degrees of
freedom to relax towards equilibrium
simultaneously. While ab initio methods have been
around for more than a decade, only recently have
they been applied to systems of more than a few
atoms. Now, however, this method can be used to
model a system of a few hundred atoms, and this
number will increase by at least a factor of 10 within
the next five years. The method has already led
to new insights into the structure of amorphous
materials, finite temperature simulations of the
new C60 solid, computation of the atomic and
electronic structures of the 7x7 reconstruction of
the Si(111) surface, melting of carbon and studies of
step geometries on semiconductor surfaces. In the
future, it will be possible to address many impor- 
tant materials phenomena including phase transfor- 
mations, grain boundaries, dislocations, disorder 
and melting. 

The problem of understanding and improving the 
methods of growth of complicated materials, such 
as multi-component heterostructures which are 
produced by epitaxial growth using molecular 
beam or chemical vapor deposition techniques, 
stands out as one very important technological ap- 
plication of this method. Although a "brute-force" 
simulation of atomic deposition on experimental 
time scales will not be possible for some time, one
can learn a great deal from studying the mech- 
anisms of reactive film growth. Combining 
atomic calculations for the structure of an inter- 
face with continuum theories of elasticity and plas- 
tic deformation is also an important area for the 
future. 

One of the most obvious areas for future
applications is biological systems, where key reaction
sequences would be simulated in ab initio fashion.
These calculations would not replace existing 
molecular mechanics approaches, but rather sup- 
plement them in those areas where they are not 
sufficiently reliable. This includes enzymatic reac- 
tions involving transition metal centers and other 
multi-center bond-reforming processes. A related 
area is catalysis, where the various proposed reac- 
tion mechanisms could be explicitly evaluated. 
Short-time finite temperature simulations can also
be explored to search for unforeseen reaction
patterns. The potential for new discoveries in these
areas is high.

Important progress has also been made in under- 
standing the excitation properties of solids, in par- 
ticular the predictions of band offsets and optical 
properties. This requires the evaluation of the 
electron self-energy and is computationally much 
heavier than the local density approaches dis- 
cussed above. This first principles quasiparticle ap- 
proach has allowed for the first time the ab initio 
calculation of electron excitation energies in solids 
valid for quantitative interpretation of
spectroscopic measurements. The excitations of systems as
complex as C_60 fullerites have been computed.
Although the quasiparticle calculations have yet to 
be implemented on massively parallel machines, it 
is doable and the gain in efficiency and power is 
expected to be similar to the ab initio molecular 
dynamics types of calculations. 

Much effort has been devoted in the past several 
years to algorithm development to extend the
applicability of these new methods to ever larger
systems. Ab initio molecular dynamics has been
successfully implemented on massively parallel
machines for systems as large as 700 atoms. Tight
binding molecular dynamics methods are an
accurate, empirical way to include the electronic
degrees of freedom which are important for
covalently bonded materials, at speeds 300 times
faster than ab initio methods. This method has
already been used to simulate 2000 atoms and with
the new massively parallel machines, this number 
will easily increase to 10,000 within a year. 
Another very exciting recent development in this 
area is work on the so-called order N methods for 
electronic structure calculations. At present,
quantum mechanical calculations scale at least as N^3
in the large N limit, where N is the number of 
atoms in a unit cell. Significant progress has been 
made recently by several groups in developing 
methods which would scale as N. The success of 
these approaches would further enhance our 
ability to study very large molecular and materials 
systems including systems with perhaps thousands 
of atoms in the near future. 
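
The practical weight of the difference between N^3 and order N scaling is easy
to see with a little arithmetic. The sketch below assumes that cost is a pure
power law, c * N**p, with the same constant c for both methods. That is a
simplification: real order N methods carry larger prefactors, so they only win
beyond some crossover size. The atom counts are the ones quoted in the text.

```python
# Relative cost of cubic-scaling versus linear-scaling ("order N") methods.
# Assumption: cost = c * N**p with the same constant c for both; real prefactors differ.

def relative_cost(n_atoms, n_reference, exponent):
    """Cost of an n_atoms calculation relative to an n_reference calculation."""
    return (n_atoms / n_reference) ** exponent


if __name__ == "__main__":
    for n in (700, 2000, 10000):
        cubic = relative_cost(n, 700, 3)
        linear = relative_cost(n, 700, 1)
        print(f"N = {n:6d}: N^3 cost x{cubic:10.1f}, order-N cost x{linear:6.1f}")
```

Under these assumptions a tenfold increase in the number of atoms costs a
thousand times more with cubic scaling but only ten times more with linear
scaling, which is why progress on order N methods matters as much as faster
hardware.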



Quantum Many-Body 

The quantum many body problem lies at the core 
of understanding many properties of materials. 
Over the last decade much of the classical 
methodology discussed above has been extended 
into the quantum regime in particular with the 
development of the path integral and Green's func- 
tion Monte Carlo methods. Early calculations of 
the correlation energy of the electron gas are exten- 
sively used in local density theory to estimate cor- 
relation energy in solids. The low temperature 
properties of liquid and solid helium, three and 
four, the simplest strongly correlated many-body 
quantum system, are now well understood thanks 
in large part to computer simulations. These quan- 
tum simulations have required thousands of hours 
on Cray-YMP class computers. 

While there still remain very difficult algorithmic 
issues, exact fermion methods and quantum 
dynamical methods to name two, the progress in 
the next decade should parallel the previous 
developments in classical statistical mechanics. 
Computer simulations of quantum many-body sys- 
tems will become a ubiquitous tool, integrated into 
theory and experiment. The software and
hardware have reached a state where much larger,
more complex and realistic systems can be studied.
Some particular examples are electrons in transi- 
tion metals, in restricted geometries, at high 
temperatures and pressures or in strong magnetic 
fields. Mean field theory is unreliable in many of 
these situations. However, these applications, if 
they are to become routine and widely distributed 
in the materials science community, will require
high performance hardware. Quantum simulations 
are naturally parallel and are likely to be among 
the first applications using massively parallel com- 
puters. 

Thanks to J. Bernholc, D. Ceperley, J. Joannopoulos,
B. Harmon and S. Louie for their help
in preparing this subsection.

Computational Molecular Biology/Chemistry/Biochemistry

by Barry Honig



Background 

There have been a number of revolutionary 
developments in molecular biology that have 
greatly expanded the need for high performance 
computing within the biological community. First, 
there has been an exponential growth in the num- 
ber of gene sequences that have been determined, 
and no end is in sight. Second, there has been a 
parallel (although slower) growth in the number of 
proteins whose structures have been determined 
from x-ray crystallography and, increasingly, from 
multidimensional NMR. This literal explosion in 
new information has led to developments in areas 
such as statistical analysis of gene sequences and 
of three dimensional structural data, new 
databases for sequence and structural information, 
molecular modeling of proteins and nucleic acids, 
and three-dimensional pattern recognition. The 
recognition of grand challenge problems such as 
protein folding or drug design has resulted in large 
part from these developments. Moreover, the con- 
tinuing interest both in sequencing the human 
genome and in the field of structural biology 
guarantees that computational requirements will 
continue to grow rapidly in the coming decade. 

To illustrate the type of problems that can arise, 
consider the case where a new gene has been iso- 
lated and its sequence is known. In order to fully 
exploit this information it is necessary to first ob- 
tain maximum information about the protein this 
gene encodes. This can be accomplished by search- 
ing a nucleic acid sequence data base for struc- 
turally or functionally related proteins, and/or by 
detecting sequence patterns characteristic of the 
three dimensional fold of a particular class of 
proteins. There are numerous complexities that 
arise in such searches, and the computational
demands imposed by the increasingly sophisticated
statistical techniques that are being used can
be formidable.
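
The database search described above can be sketched with a small local-alignment scorer. This is only an illustration under stated assumptions: the Smith-Waterman recurrence with a linear gap penalty is one standard choice rather than the method the text prescribes, and the scoring values, function names, and toy sequences are hypothetical.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best Smith-Waterman local alignment score (linear gap penalty)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def rank_database(query, database):
    """Return (name, score) pairs sorted from best to worst local match."""
    scores = [(name, local_alignment_score(query, seq))
              for name, seq in database.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Toy database; the sequences are invented for illustration only.
database = {
    "seq_a": "TTGATTACATT",   # contains the query exactly
    "seq_b": "GATCACA",       # one substitution relative to the query
    "seq_c": "CCCCCCCC",      # unrelated
}
ranking = rank_database("GATTACA", database)
for name, score in ranking:
    print(name, score)
```

The cost of each comparison grows with the product of the two sequence lengths, so scanning a large database with a scorer of this kind is where the computational demand arises.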

A variety of methods, all of them requiring vast 
computational resources, are currently being applied
to the protein folding problem (predicting
three dimensional structure from amino acid se- 
quence). Methods include statistical analyses (in- 
cluding neural nets) of homologies to known 
structures, approaches based on physical chemical 
principles and simplified lattice models of the type 
used in polymer physics. A major problem in un- 
derstanding the physical principles of protein and 
nucleic acid conformation is the treatment of the surrounding
solvent. Molecular dynamics techniques
are widely used to model the solvent but their ac- 
curacy depends on the potential functions that are 
used as well as the number of solvent molecules 
that can be included in a simulation. Thus, the 
technique is limited by the available computation- 
al power. Continuum solvent models offer an alter- 
native approach but these too are highly computer 
intensive. 

Even assuming a reliable method to evaluate free 
energies, the problem of conformational search is 
daunting. There are a large number of possible 
conformations available to a macromolecule and it 
is necessary to develop methods, such as Monte
Carlo techniques with simulated annealing, to en- 
sure that the correct one has been included in the 
generated set of possibilities. A similar set of 
problems arises, for example, in the problem of 
structure-based drug design. In this case one may 
know the three-dimensional structure of a protein 
and it is necessary to design a molecule that binds 
tightly to a particular site on the surface. Efficient 
conformational search, energy evaluation and pat- 
tern recognition are requirements of this problem, 
all requiring significant computational power. 
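
As a rough illustration of the Monte Carlo search with simulated annealing mentioned above, the sketch below anneals a single coordinate on a made-up rugged energy function. The energy function, cooling schedule, step size, and all names are assumptions chosen for brevity; a real conformational search would work in many dimensions with a physical energy model.

```python
import math
import random


def energy(x):
    """Hypothetical rugged 1-D 'conformational' energy with many local minima."""
    return 0.1 * x * x + math.sin(5.0 * x)


def simulated_annealing(x0, n_steps=20000, t_start=2.0, t_end=0.01,
                        step=0.5, seed=0):
    """Metropolis Monte Carlo with a geometric cooling schedule."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    for k in range(n_steps):
        # Temperature falls geometrically from t_start to t_end.
        t = t_start * (t_end / t_start) ** (k / (n_steps - 1))
        x_new = x + rng.uniform(-step, step)
        e_new = energy(x_new)
        # Metropolis criterion: always accept downhill moves, and accept
        # uphill moves with probability exp(-dE / T).
        if e_new <= e or rng.random() < math.exp(-(e_new - e) / t):
            x, e = x_new, e_new
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e


best_x, best_e = simulated_annealing(x0=4.0)
print(f"lowest energy found: {best_e:.3f} at x = {best_x:.3f}")
```

Accepting some uphill moves at high temperature is what lets the walk escape local minima; the number of energy evaluations needed grows quickly with the number of degrees of freedom.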

Significant progress has been made in these and 
many other related areas. Ten years ago most cal- 
culations were made without including the effects 
of solvent. This situation has changed dramatically 
due to scientific progress that has been potentiated 
by the availability of significant computational 
power. Some of this has been provided by supercomputer
centers while some has been made available
by increasingly powerful workstations. Fast
computers have also been crucial in the very
process of three dimensional structure determina- 
tion. Both x-ray and NMR data analysis have ex- 
ploited methods such as molecular dynamics and 
simulated annealing to yield atomic coordinates of 
macromolecules. More generally, the new dis- 
cipline of structural biology, which involves the 
structure determination and analysis of biological 
macromolecules, has been able to evolve due to the
increased availability of high performance comput- 
ing. 

Future 

Despite the enormous progress that has been made,
the field is just beginning to take off. Gene se- 
quence analysis will continue to become more ef- 
fective as the available data continue to grow and 
as increasingly sophisticated data analysis techni- 
ques are applied. It will be necessary to make state- 
of-the-art sequence analysis available to individual 
investigators, presumably through distributed 
workstations and through access to centralized 
resources. This will require a significant training 
effort as well as the development of user-friendly 
programs for the biological community. 

There is enormous potential in the area of three 
dimensional structure analysis. There is certain to 
be major progress in understanding the physical 
chemical basis of biological structure and function.
Improved energy functionals resulting from
progress in quantum mechanics will become avail- 
able. Indeed a combination of quantum mechanics 
and reaction field methods will make it possible to 
obtain accurate descriptions of molecules in the 
condensed phase. The impact of such work will be 
felt in chemistry as well as in biology. Improved 
descriptions of the solvent through a combination 
of continuum treatments and detailed molecular 
dynamics simulations at the atomic level will lead 
to truly level descriptions of the conformational 
free energies and binding free energies of biologi- 
cal macromolecules. When combined with sophis- 
ticated conformational search techniques, 
simplified lattice models, and sophisticated statisti- 
cal techniques that identify sequence and struc- 
tural homologies, there is every reason to expect 
major progress on the protein folding problem. 



There will be parallel improvements in structure 
based design of biologically active compounds 
such as pharmaceuticals. Moreover, the develop- 
ment of new compounds based on biomimetic 
chemistry and new materials based on polymer 
design principles deduced from biomolecules 
should become a reality. All of this progress will 
require increased access to high performance com- 
puting for the reasons given above. The various 
simulation and conformational search techniques 
will continue to benefit dramatically from in- 
creased computational power. This will be true at 
the level of individual workstations, which a 
bench chemist for example might use to design a 
new drug. Work of this type often requires sophis- 
ticated three dimensional graphics and will benefit 
from progress in this area. Massively parallel 
machines will certainly be required for the
most ambitious projects. Indeed, it is likely that 
for some applications the need for raw computing 
power will exceed what is available in the foresee- 
able future. 

New developments in the areas covered in this sec- 
tion will have major economic impact. The 
biotechnology, pharmaceutical and general health 
industries are obvious beneficiaries but there will 
be considerable spin-off in materials science as 
well. 

Recommendations 

♦ Support the development of software in com- 
putational biology and chemistry. This should 
take the form of improved software and algo- 
rithms for workstations as well as the porting 
of existing programs and the development of 
new ones on massively parallel machines. 

♦ Make funds available for training that will ex- 
ploit new technologies and for familiarizing 
biologists with existing technologies. 

♦ Funding should be divided between large
centers, smaller centers involving a group of 
investigators at a few sites developing new 
technologies, and individual investigators. 

Molecular Modeling and Quantum Chemistry



by William Lester 



The need for high performance computing has 
been met historically by large vector supercom- 
puters. It is generally agreed in the computer and 
computational science communities that significant
improvements in computational efficiency
will arise from parallelism. The advent of
conventional parallel computer systems has
generally required major restructuring of computer code
to move applications from vector serial computers
to parallel architectures. The move to
parallelism has occurred in two forms: distributed 
MIMD machines and clusters of workstations with 
the former receiving the focus of attention in large 
multi-user center facilities and the latter in local re- 
search installations. 

The tremendous interest in the simulation of 
biological processes at the molecular level using 
molecular mechanics and molecular dynamics 
methods has led to a continuing increase in demand
for computational power. Applications have potentially
high practical value and include, for example,
the design of inhibitors for enzymes that
are suspected to play a role in disease states and 
the effect of various carcinogens on the structure 
of DNA. 

In the first case, one expects that a molecule 
designed to conform within the three-dimensional 
arrangement of the enzyme structure should be 
bound tightly to the enzyme in solution. This re- 
quirement, and others, make it desirable to know 
the tertiary structure of the enzyme. The use of 
computation for this purpose is contributing sig- 
nificantly to the understanding of those structures 
which are then used to guide organic synthesis. 

In the second case, serial vector supercomputers 
typically can carry out molecular dynamics simula- 
tions of DNA for time frames of only picoseconds 
to nanoseconds. A recent calculation of 200-ps in- 
volving 3542 water molecules and 16 sodium ions 
took 140 hours of Cray Y-MP time. Extending 
such calculations to the millisecond or even the
second range where important motions can occur
remains a major computational challenge that will
require the use of massively parallel computer sys- 
tems. 
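
The scale of this challenge can be estimated from the figures quoted above (200 ps in 140 Cray Y-MP hours). The sketch below is back-of-the-envelope arithmetic only; it assumes cost grows linearly with simulated time on the same machine and system size, and the function and constant names are illustrative.

```python
SIMULATED_PS = 200.0   # simulated time of the quoted calculation, in ps
CPU_HOURS = 140.0      # Cray Y-MP hours quoted for that calculation
HOURS_PER_YEAR = 24 * 365

hours_per_ps = CPU_HOURS / SIMULATED_PS


def cpu_hours_for(simulated_seconds):
    """Extrapolated machine hours, assuming cost is linear in simulated time."""
    picoseconds = simulated_seconds / 1e-12
    return picoseconds * hours_per_ps


for label, seconds in [("1 nanosecond", 1e-9),
                       ("1 microsecond", 1e-6),
                       ("1 millisecond", 1e-3)]:
    hours = cpu_hours_for(seconds)
    print(f"{label}: {hours:.3g} hours (~{hours / HOURS_PER_YEAR:.3g} years)")
```

Under these assumptions a single millisecond would take on the order of 7 x 10^8 hours on the same machine, which is why speedups of many orders of magnitude are needed.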

Although in molecular mechanics or force field 
methods, computational effort is dominated by the 
evaluation of the force field that gives the poten- 
tial energy as a function of internal coordinates of 
the molecules and non-bonded interactions be- 
tween atoms, a popular approach for small organic 
molecules is the ab initio Hartree-Fock (HF) 
method which has come into routine use by or- 
ganic and medicinal chemists to study compounds 
and drugs. The HF method is ab initio because the 
calculation depends only on basic information 
about the molecule, including the number and 
types of atoms and the total charge. The computational
effort of HF computations scales as N^4,
where N is the number of basis functions used to 
describe the atoms of the molecule. Because the 
HF method describes only the "average" behavior 
of electrons, it typically provides a better descrip- 
tion of relative geometries than of energetics. The 
accurate treatment of the latter requires proper account
of the instantaneous correlated motions of
electrons, which inherently are not described by the
HF method. 

For systems larger than those accessible with the 
HF methods, one has, in addition to molecular 
mechanics methods, semiempirical approaches.
Their name arises out of the use of experimental 
data to parameterize integrals and other simplifica- 
tions of the HF method leading to a reduction of 
computational effort to order N^3. Results of these
methods can be informative for systems where 
parametrization has been performed. 
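
To make the difference between N^4 and N^3 scaling concrete, the following sketch compares relative costs as the basis set grows. It uses only the formal exponents quoted above; prefactors, which differ greatly between methods, are ignored, and the reference size of 100 basis functions is an arbitrary assumption.

```python
def relative_cost(n_basis, exponent, n_ref=100):
    """Cost relative to a reference calculation with n_ref basis functions."""
    return (n_basis / n_ref) ** exponent


print(f"{'N':>6} {'HF (N^4)':>12} {'semiempirical (N^3)':>22}")
for n in (100, 200, 400, 800):
    hf = relative_cost(n, 4)
    semi = relative_cost(n, 3)
    print(f"{n:>6} {hf:>12.0f} {semi:>22.0f}")
```

Doubling the number of basis functions multiplies the formal HF cost by 16 but the semiempirical cost by only 8, so the gap widens with system size.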

Recently, the density functional (DF) method has 
become popular, overcoming deficiencies in ac- 
curacy for chemical applications that limited ear- 
lier use. Improvements have come in the form of 
better basis sets, advances in computational algo- 
rithms for solving the DF equations, and the 
development of analytical geometry optimization
methods. The DF method is an ab initio approach
that takes into account electron correlation. In 
view of the latter capability, it can be used to study 
a wide variety of systems, including metals and in- 
organic species. 

The move to parallel systems has turned out to be 
a major undertaking in software development for 
the approaches described. Serious impediments 
have been encountered in algorithm modification 
for methods that go beyond the ab initio HF 
method, and in steps to maximize efficiency with 
increased numbers of processors. These cir- 
cumstances have increased interest in quantum 
Monte Carlo (QMC) methods for electronic struc- 
ture. In addition, QMC methods have been used 
with considerable success for the calculation of 
vibrational eigenvalues, and in statistical 
mechanics studies. 

QMC, as used here in the context of electronic 
structure, is an ab initio method for solving the 
Schroedinger equation stochastically based on the 
formal similarity between the Schroedinger equa- 
tion and the classical diffusion equation. The 
power of the method is that it is inherently an N- 
body method that can capture all of the instan- 
taneous correlation of the electrons. The QMC 
method is readily ported to parallel computer sys- 
tems with orders of magnitude savings in computa- 
tional effort over serial vector supercomputers. 
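
The diffusion analogy can be illustrated with a deliberately minimal, unguided diffusion Monte Carlo run for a single particle in a one-dimensional harmonic well, whose exact ground-state energy is 0.5 in units where hbar = m = omega = 1. This is a sketch under simplifying assumptions (no importance sampling, no fermion antisymmetry, a crude population-control rule); the function name and all parameter values are hypothetical and it is not a production QMC algorithm.

```python
import math
import random


def dmc_ground_state_energy(n_target=400, n_steps=2000, n_equil=500,
                            dt=0.01, seed=1):
    """Unguided diffusion Monte Carlo for V(x) = x^2 / 2."""
    rng = random.Random(seed)
    walkers = [rng.uniform(-2.0, 2.0) for _ in range(n_target)]
    e_ref = 0.0                 # initial guess for the reference energy
    sigma = math.sqrt(dt)       # diffusion constant 1/2 -> variance dt
    samples = []
    for step in range(n_steps):
        survivors = []
        for x in walkers:
            x += rng.gauss(0.0, sigma)                      # diffusion step
            weight = math.exp(-dt * (0.5 * x * x - e_ref))  # growth/decay
            copies = int(weight + rng.random())             # birth or death
            survivors.extend([x] * copies)
        walkers = survivors
        # With no guiding function, the walker average of V estimates E_0.
        mean_v = sum(0.5 * x * x for x in walkers) / len(walkers)
        # Simple feedback keeps the population near n_target.
        e_ref = mean_v + (1.0 - len(walkers) / n_target)
        if step >= n_equil:
            samples.append(mean_v)
    return sum(samples) / len(samples)


energy_estimate = dmc_ground_state_energy()
print(f"estimated ground-state energy: {energy_estimate:.3f} (exact: 0.5)")
```

Each walker moves and branches independently within a time step, which is the property that makes the method straightforward to distribute across many processors.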

In the statistical mechanical studies of complex 
systems, one is often interested in the spontaneous 
formation and energetics of structure over large 
length scales. The mesoscopic structures, such as 
vesicles and lamellars, formed from self assembly 
in oil/water-surfactant mixtures are important ex- 
amples. The systematic analysis of these 
phenomena has only recently begun, and computer
simulation is one of the important tools in
this analysis. Due to the large length scales in- 
volved, simulation is necessarily confined to very 
simple classes of models. Even so, the work pres- 
ses the capabilities of current computational equip- 
ment to their limits. While the stability of various 
nontrivial structures has been documented, we
are still far from understanding the rich phase 
diagram in such systems. The work of Smit and
coworkers using transputers demonstrates the
utility of parallelization in these simulations. Fu- 
ture equipment should carry us much further 
towards understanding. 

Competing interactions and concomitant "frustration"
characterize complex fluids and the resulting
mesoscopic structures. Such competition is
also a central feature of "random polymers" - a 
model for proteins and also for manufactured 
polymers. Computer simulation studies of random 
polymers can be extremely useful. Here too, though,
the computations of even the simplest models
press the limits of current technology. To treat this 
class of systems, Binder and coworkers and 
Frenkel and coworkers have developed new algo- 
rithms, some of which are manifestly paral- 
lelizable. Thus, this area is one where the new 
computer technology should be very helpful. 

Along with large length scale fluctuations, as in 
self assembly and polymers, simulations press cur- 
rent computational equipment where relaxation oc- 
curs over many orders of magnitude. This is the 
phenomenon of glasses. Here, the work of Fredrickson
on the spin-facilitated Ising system
demonstrates the feasibility of parallelization in 
studying long time relaxation and the glass transi- 
tion by simulation. 

Polymers and glasses are examples pertinent to the 
understanding and design of advanced materials. 
In addition, one needs to understand the electronic 
and magnetic behavior of these and other con- 
densed matter systems. In recent years, a few 
methods have appeared, especially the Car-Par- 
rinello approach, which now makes feasible the 
calculation of electronic properties of complex 
materials. The calculations are intensive. For ex- 
ample, studying the dynamics and electronic struc- 
ture of a system with only 64 atoms, periodically 
replicated, for only a picosecond is at the limits of 
current capabilities. With parallelization, and 
simplified models, one can imagine, however, significant
progress in our understanding of metal-insulator
transitions and localization in correlated
disordered systems. 



Mathematics and High Performance Computing 



by James Sethian 



A. Introduction 

Mathematics underlies much, if not all, of high 
performance computing. At first glance, it might 
seem that mathematics, with its emphasis on 
theorems and proofs, might have little to con- 
tribute to solving large problems in the physical 
sciences and engineering. On the contrary, in the 
same way that mathematics contributes the under- 
lying language for problems in the sciences, en- 
gineering, discrete systems, etc., mathematical 
theory underlies such factors as the design and un- 
derstanding of algorithms, error analysis, ap- 
proximation accuracy, and optimal execution. 
Mathematics plays a key role in the drive to 
produce faster and more accurate algorithms 
which, in tandem with hardware advances, 
produce state-of-the-art simulations across the 
wide spectrum of the sciences. 

At the same time, high performance computing 
provides a valuable laboratory tool for many areas 
of theoretical mathematics such as number theory 
and differential geometry. At the heart of most 
simulations lies a mathematical model and an algo- 
rithmic technique for approximating the solution 
to this model. Aspects of such areas as approxima- 
tion theory, functional analysis, numerical 
analysis, probability theory, and the theory of dif- 
ferential equations provide valuable tools for 
designing effective algorithms, assessing their ac- 
curacy and stability, and suggesting new techniques. 

What is so fascinating about the intertwining of 
computing and mathematics is that each in- 
vigorates the other. For example, understanding of 
entropy properties of differential equations has
led to new methods for high resolution shock 
dynamics, approximation theory using multipoles 
has led to fast methods for N-body problems, 
methods from hyperbolic conservation laws and 
differential geometry have produced exciting 
schemes for image processing, parallel computing 
has spawned new schemes for numerical linear algebra
and multi-grid techniques, and methods
designed for tracking physical interfaces have
launched new theoretical investigations in differen- 
tial geometry, to name just a few. Along the way, 
this interrelation between mathematics and com- 
puting has brought breakthroughs in such areas as 
material science (such as new schemes for 
solidification and fracture problems), computation- 
al fluid dynamics (e.g., high order projection 
methods and sophisticated particle schemes), com- 
putational physics (such as new schemes for Ising 
models and percolation problems), environmental 
modeling (such as new schemes for groundwater 
transport and pollutant modeling) and combustion 
(e.g. new approximation models and algorithmic 
techniques for flame chemistry/fluid mechanical 
interactions). 

B. Current State 

Mathematical research which contributes to high 
performance computing exists across a wide 
range. On one end are individual investigators or 
small, joint collaborations. In these settings, the 
work takes a myriad of forms: brand-new algorithms
are invented which can save an order of
magnitude in computer resources, existing
techniques are analyzed for convergence
properties and accuracy, and model problems are 
posed which can isolate particular phenomena. 
For example, such work includes analysts working 
on fundamental aspects of the Navier-Stokes equations
and turbulence theory (cf. the following section
on Computational Fluid Dynamics), applied
mathematicians designing new algorithms for 
model equations, discrete mathematicians 
focussed on combinatorics problems, and numerical
linear algebraists working in optimization
theory. At the other end are mathematicians work- 
ing in focussed teams on particular problems, for 
example, in combustion, oil recovery, 
aerodynamics, material science, computational 
fluid dynamics, operations research, cryptography, 
and computational biology. 

Institutionally, mathematical work in high performance
computing is undertaken at universities, the
National Laboratories, the NSF High Performance 
Computing Centers, NSF Mathematics Centers 
(such as the Institute for Mathematical Analysis,
the Mathematical Sciences Research Institute,
and the Institute for Advanced Study), and across 
a spectrum of industries. In recent years, high per- 
formance computing has become a valuable tool 
for understanding subtle aspects of theoretical 
mathematics. For example, computing has revolu- 
tionized the ability to visualize and evolve com- 
plex geometric surfaces, provided techniques to 
untie knots, and helped compute algebraic struc- 
tures. 

C. Recommendations 

Mathematical research is critical to ensure state-of- 
the-art computational and algorithmic techniques 
which foster the efficient use of national comput- 
ing resources. In order to promote this work, it is 
important that:

1. Mathematicians be supported in their need 
to have access to the most advanced computing
systems available, both through networks
to supercomputer facilities, on-site experimental
machines (such as parallel
processors), and individual high-speed 
workstations. 

2. State-of-the-art research in modeling, new 
algorithms, applied mathematics, numerical 
analysis, and associated theoretical analysis 
be amply supported; it is this work that con- 
tinually and continuously rejuvenates com- 
putational techniques. Without it, 
yesterday's algorithms will be running on 
tomorrow's machines. 

3. Such research be supported on all levels: the
individual investigator, small joint collabora- 
tions, interdisciplinary teams, and large 
projects. 

4. Funding be significantly increased in the 
above areas, both to foster frontier research 
in computational techniques, and to use 
computation as a bridge to bring mathe- 
matics and the sciences closer together. 



Computational Fluid Dynamics
by James Sethian 



Introduction 

The central goal of computational fluid dynamics 
(CFD) is to follow the evolution of a fluid by solv- 
ing the appropriate equations of motion on a 
numerical computer. The fundamental equations 
require that the mass, momentum, and energy of a 
liquid/gas are conserved as the fluid moves. In all 
but the simplest cases, these equations are too dif- 
ficult to solve mathematically, and instead one 
resorts to computer algorithms which approximate 
the equations of motion. The yardstick of success 
is how well the results of numerical simulation 
agree with experiment in cases where careful 
laboratory experiments can be established, and 
how well the simulations can predict highly complex
phenomena that cannot be isolated in the
laboratory. The effectiveness and versatility of a 
computational fluid dynamics simulation rests on 
several factors. First, the underlying model must 
adequately describe the essential physics. Second, 
the algorithm must accurately approximate the 
equations of motion. Third, the computer program 
must be constructed to execute efficiently. And 
fourth, the computer equipment must be fast 
enough and large enough to calculate the answers 
sufficiently rapidly to be of use. Weaving these 
factors together, so that answers are accurate, reli- 
able, and obtained with acceptable cost in an ac- 
ceptable amount of time, is both an art and a 
science. Current uses of computational fluid 
dynamics range from analysis of basic research
into fundamental physics to commercial applications.
While the boundaries are not sharp, CFD
work may be roughly categorized in three ways:
Fundamental Research, Applied Science, and Industrial
Design and Manufacturing.
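
The conservation requirement stated in the introduction has a direct discrete counterpart: a scheme written in flux form moves a quantity between neighboring cells without creating or destroying it. The sketch below shows this for the simplest possible case, first-order upwind advection of a density on a periodic one-dimensional grid. The scheme, grid size, and names are assumptions chosen for illustration and are far simpler than the methods used in practice.

```python
def upwind_step(u, courant):
    """One conservative upwind update of u_t + a u_x = 0 (a > 0), periodic grid.

    courant = a * dt / dx must lie in [0, 1] for stability.
    u[i - 1] wraps around at i = 0, which gives the periodic boundary.
    """
    return [u[i] - courant * (u[i] - u[i - 1]) for i in range(len(u))]


def total_mass(u, dx):
    """Discrete integral of the density over the domain."""
    return dx * sum(u)


n_cells = 50
dx = 1.0 / n_cells
# Initial condition: a square pulse of density 1 in cells 10..19.
u = [1.0 if 10 <= i < 20 else 0.0 for i in range(n_cells)]

mass_before = total_mass(u, dx)
for _ in range(100):
    u = upwind_step(u, courant=0.5)
mass_after = total_mass(u, dx)

print(f"mass before: {mass_before:.12f}, mass after: {mass_after:.12f}")
```

The pulse smears out because the scheme is only first-order accurate, but the total mass is unchanged up to rounding; balancing accuracy of this kind against cost is the trade-off the three categories above weigh differently.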

CFD and Fundamental Research 

At many of the nation's research universities and
national laboratories, much of the focus of CFD 
work is on fundamental research into fluid flow 
phenomena. The goal is to understand the role that 
fluid motion plays in such areas as the evolution 
of turbulence in the atmosphere and in the oceans, 
the birth and evolution of galaxies, atmospherics 
on other planets, the formation of polymers, the 
physiological fluid flow in the body, and the inter- 
play of fluid mechanics and material science such 
as in the physics of superconductors. In these 
simulations, often employing the most advanced 
and sophisticated algorithms, the emphasis is on 
accurate solutions and basic insight. These calcula- 
tions are often among the most expensive of all 
CFD simulations, requiring many hundreds of 
hours of computer time on the most advanced 
machines available for a single simulation. The 
modeling and algorithmic techniques for such 
problems are constantly under revision and refine- 
ment. For the most part, the major advances in 
new algorithmic tools, from schemes to handle the 
associated numerical linear algebra to high order 
methods to approximate differential equations,
have their roots in basic research into CFD applied 
to fundamental physics. 

CFD and Applied Science 

Here, the main emphasis is on the application of 
the tools of computational fluid dynamics to 
problems motivated by specific problems such as 
might occur in natural phenomena or physical 
processes. Such work might include detailed 
studies of the propagation of flames in engines or 
fire research in closed rooms, the fluid mechanics 
involved in the dispersal of pollutants or toxic 
groundwater transport, the hydraulic response of a 
proposed heart valve, the development of severe 
storms in the atmosphere, and the aerodynamic 
properties of a proposed space shuttle design. For
the most part, this work is also carried out
throughout the national laboratories and universities
with government support. While cost is not a
major issue in these investigations, the focus is 
more on obtaining answers to directed questions. 
Less concerned with the algorithm for its own 
sake, this work links basic research with commer- 
cial CFD applications, and provides a stepping 
stone for advances in algorithms to propagate into 
industrial sectors. 

CFD and Industrial Design and Manufacturing 

The focus in this stage of the process is on apply- 
ing the tools of computational fluid dynamics to 
solve problems that directly relate to technology. 
A vast array of examples exist, such as the 
development of a high-speed inkjet plotter, the ac- 
tion of slurry beds for processing minerals, 
analysis of the aerodynamic characteristics of an 
automobile or airplane, efficiency analysis of an 
internal combustion engine, performance of high- 
speed computer disk drives, and optimal pouring 
and packaging techniques in manufacturing. For 
the most part, such work is carried out in private 
industry, often with only informal ties to academic 
and government scientists. Communication of new 
ideas rests loosely on the influx of new employees 
trained in the latest techniques, journal articles, 
and professional conferences. A distinguishing 
characteristic of this work is its emphasis on turnaround
time and cost. Here, cost is not only the
cost of the equipment to perform the calculation,
but the people-years involved in developing the
computer code, and the time involved performing
many hundreds of simulations as part of a detailed
parameter study. This motivation is quite different 
from that in the other two areas. The need to per- 
form a large number of simulations under extreme- 
ly general circumstances may mean that a simple 
and fast technique that attacks only a highly 
simplified version of the problem may be 
preferable to a highly sophisticated and accurate 
technique that requires many orders of magnitude 
more computational effort. This orientation lies
at the heart of the applicability and suitability of 
computational techniques to a competitive in- 
dustry. 

Future of CFD

♦ Modeling and Algorithmic Issues.

In many ways, the research, applied science, 
and industry agenda in CFD has changed, 
mostly in response to increased computational
power coupled to significant algorithmic
and theoretical/numerical advances. On the 
research side, up through the early 1980's, 
the emphasis was on basic discretization 
methods. In that setting, it was possible to 
develop methods based on looking at simple 
problems in two, or even one space dimen- 
sion, in simple geometries, and in a fair de- 
gree of isolation from the fluid dynamics 
applications. Over the last five years, how- 
ever, there has been a transition to the next 
generation of problems. These problems are 
more difficult in part because they are at- 
tempting more refined and detailed simula- 
tions, necessitating finer grids and more 
computational elements. However, a more 
fundamental issue is that these problems are 
qualitatively different from those previously 
considered. To begin, they often involve com- 
plex and less well-understood physical 
models - chemically reacting fluid flows, 
flows involving multiphase or multicom- 
ponent mixtures of fluids or other complex 
constitutive behavior. They are often set in 
three dimensions, in which both the solution 
geometry and the boundary geometry are 
more complicated than in two dimensions. 
Finally, they often involve resolving multiple 
length and time scales, such as boundary and 
interior layers coming from small diffusive 
terms, and intermittent large variations that 
arise in fluid turbulence. Algorithmically, 
these require work in several areas. Complex 
physical behavior makes it necessary to 
develop a deeper understanding of the 
physics and modeling than had previously 
been required. Additional physics requires 
the theory and design of complex boundary 
conditions to couple the fluid mechanics to 
the rest of the problem. Multiple length 
scales and complex geometries lead to 
dynamically adaptive methods, since
memory and compute power are still insufficient
to brute force through most problems.
For example, the wide variation in length and
time scales in turbulent combustion requires a
host of iterative techniques, stiff ODE solvers,
and adaptive techniques. Further algorithmic
advances are required in areas such as
domain decomposition, grid partitioning,
error estimation, and preconditioners and
iterative solvers for non-symmetric,
non-diagonally dominant matrices. Mesh generation,
while critical, has not progressed far.
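
The need for stiff solvers when time scales differ widely can be seen in a single linear test equation, y' = lambda * y with a large negative lambda standing in for a fast chemical time scale. The sketch below compares an explicit and an implicit Euler step at a step size suited to the slow dynamics. The test problem, step size, and function names are assumptions for illustration; production codes use far more sophisticated stiff integrators.

```python
def forward_euler(lam, y0, dt, n_steps):
    """Explicit Euler: y_{n+1} = (1 + dt * lam) * y_n."""
    y = y0
    for _ in range(n_steps):
        y = y + dt * lam * y
    return y


def backward_euler(lam, y0, dt, n_steps):
    """Implicit Euler: y_{n+1} = y_n / (1 - dt * lam)."""
    y = y0
    for _ in range(n_steps):
        y = y / (1.0 - dt * lam)
    return y


lam = -1000.0   # fast decay rate (hypothetical stiff mode)
dt = 0.01       # step chosen for the slow dynamics, ten times 1/|lam|

explicit = forward_euler(lam, 1.0, dt, 50)
implicit = backward_euler(lam, 1.0, dt, 50)

print(f"explicit Euler after 50 steps: {explicit:.3e}")  # grows without bound
print(f"implicit Euler after 50 steps: {implicit:.3e}")  # decays, like the true solution
```

The explicit method is stable only if dt is below 2/|lambda|, so the fastest mode dictates the step size even when it is physically irrelevant; the implicit method removes that restriction at the price of solving an equation (in general a nonlinear system) at every step.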

All in all, to accomplish in three-dimensional 
complex flow what is now routine in two- 
dimensional basic flow will require theory, 
numerics, and considerable cleverness. The 
net effect of these developments is to make 
the buy-in to perform CFD research much 
higher. Complex physical behavior makes it 
necessary to become more involved with the 
physics modeling than had previously been 
the case. The problems are sufficiently dif- 
ficult that one cannot blindly throw them at 
the computer and overwhelm them. A consid- 
erable degree of mathematical, numerical, 
and physical understanding must be obtained 
about the problems in order to obtain effi- 
cient and accurate solution techniques. On 
the industrial side, truly complex problems 
are still out-of-reach. For example, in the 
aircraft industry, we are still a long way from 
a full Navier-Stokes high Reynolds number 
unsteady flow around a commercial aircraft. 
Off in the distance are problems of takeoff 
and landing, multiple wings in close relation 
to each other, and flight recovery from sud- 
den changes in conditions. In the automotive 
industry, a solid numerical simulation of the 
complete combustion cycle (as opposed to a 
time-averaged transport model) is still many 
years away. Other automotive CFD problems 
include analysis of coolant flows, thermal 
heat transfer, plastic mold problems, and 
sheet metal formation. 

♦ High Performance Computing Issues. 



The computing needs for continuing CFD work
over the next five years are substantial. As an
example, an unforced Navier-Stokes simulation
might require 1000 cells in each of three
space dimensions. Presuming 100 flops per
cell and 25,000 time steps, this yields
$10^9 \times 10^2 \times 2.5 \times 10^4 = 2.5 \times 10^{15}$
flops; a typical compute time of 2 hours would
then require a machine of more than 300
gigaflops. Adding forcing, combustion, or
other physics severely extends this
calculation. As a related issue, memory
requirements pose an additional problem.
Ultimately, more compute power and memory
are needed. The promise of a
single, much faster, larger vector machine is 
not being made convincingly, and CFD is at- 
tempting to adapt accordingly. Here is an 
area where the dream of parallelism is both 
tantalizing and frustrating. To begin, parallel
computing has naturally led to an emphasis on
issues related to processor allocation and load
balancing. To this end, communications cost
accounting (as opposed to simply accounting
for floating point costs) has become important
in program design. For example, parallel
machines can often have poor cache memory
management and a limited number of paths
to/from main memory; this can imply a long
memory fetch/store time, which can result in
actual computational speeds for real CFD
problems far below the peak speed. To
compensate, parallel computers tend to be less
memory efficient than vector machines, as
space is exchanged for communication time
(duplicating data where possible rather than
sending it between processors). The move to
parallel machines is complicated by the fact
that millions of lines of CFD codes have been
written in the
serial/vector format. The instability of the
hardware platforms, the lack of a standard 
global high performance Fortran and C, the 
lack of complete libraries, and insecurity as- 
sociated with a volatile industry all contribute 
to the caution and reluctance of all but the 
most advanced research practitioners of CFD. 
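
The back-of-envelope estimate in this section (1000 cells per dimension, 100 flops per cell per step, 25,000 time steps, a 2-hour run) can be reproduced with a few lines of arithmetic. The sketch below is only an illustration: the function name and its parameters are hypothetical, and it assumes every cell costs the same number of flops on every step.

```python
def required_rate(cells_per_dim, dims, flops_per_cell, time_steps, wall_seconds):
    """Return (total flops, sustained flops/second) for a uniform-grid run.

    Assumption: uniform cost per cell per step, no adaptivity, no I/O.
    """
    total_cells = cells_per_dim ** dims
    total_flops = total_cells * flops_per_cell * time_steps
    return total_flops, total_flops / wall_seconds


# Figures taken from the text: 1000^3 cells, 100 flops/cell, 25,000 steps, 2 hours.
total, rate = required_rate(1000, 3, 100, 25_000, 2 * 3600)
print(f"total work:     {total:.2e} flops")
print(f"sustained rate: {rate / 1e9:.0f} gigaflops")
```

Under these assumptions the total is 2.5 x 10^15 flops and the sustained rate comes out near 350 gigaflops, the same order of magnitude as the 300 gigaflops quoted above.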

Recommendations 

In order to tackle the next generation of CFD 
problems, the field will require: 

♦ Significant accessibility to the fastest current 
coarse-grained parallel machines. 

♦ Massively parallel machines with large 
memory, programmable under stable 
programming environments, including high 
performance Fortran and C, mathematical 
libraries, functioning I/O systems, and ad- 
vanced visualization systems. 

♦ Algorithmic advances in adaptive meshing, 
grid generation, load balancing, and high 
order difference, element, and particle 
schemes. 

♦ Modeling and theoretical advances coupling 
fluid mechanics to other related physics. 



High Performance Computing In Physics 
by James Sethian and Neal Lane 



INTRODUCTION AND BACKGROUND 

High Energy Physics 

Two areas in which high performance computing 
plays a crucial role are lattice gauge theory and the 
analysis of experimental data. Lattice gauge 
theory addresses some of the fundamental theoretical
problems in high energy physics, and is relevant
to experimental programs in high energy and 
nuclear physics. In the standard model of high 
energy physics, the strong interactions are 
described by quantum chromodynamics (QCD). 
In this theory the forces are so strong that the
fundamental entities, the quarks and gluons, are not
observed directly under ordinary laboratory
conditions. Instead one observes their bound states,
protons and neutrons, the basic constituents of the 
atomic nucleus, and a host of short lived particles 
produced in high energy accelerator collisions. 
One of the major objectives of lattice gauge theory
is to calculate the masses and other basic
properties of these strongly interacting particles
from first principles. Such calculations provide a
test of QCD, and suggest that the same tools could
be used to calculate additional physical quantities
which may not be so well determined experimentally.
In addition, lattice gauge theory provides an avenue
for making first-principles calculations of the
effects of strong interactions on weak interaction
processes, and thus holds the promise of providing
crucial tests of the standard model at its most
vulnerable points. And, although quarks and gluons
are not directly observed in the laboratory, it is
expected that at extremely high temperatures one would
find a new state of matter consisting of a plasma 
of these particles. The questions being addressed 
by lattice gauge theorists are the nature of the tran- 
sition between the lower temperature state of ordi- 
nary matter and the high temperature quark-gluon 
plasma, the temperature at which this transition oc- 
curs, and the properties of the plasma. In the 
general area of experimental high energy physics, 
there are three primary areas of computing: 

1) The processing of the raw data that is usually
accumulated at a central accelerator
laboratory, such as the Wilson Laboratory at 
Cornell. 

2) The simulation of physical processes of
interest, and the simulation of the behavior of
the final states in the detector.

3) The analysis of the compressed data that 
results from processing of the raw data and 
the simulations. 

Atomic and Molecular Physics 

In contrast to some areas of theoretical physics,
the atomic and molecular (AM) theorist has the
advantage of understanding the basic equations
governing the evolution of the system of particles
under consideration. However, the wealth of
phenomena that derive from the many-body
interactions of the constituents and their
interactions with external probes such as electric
and magnetic fields is truly astounding.
Computation now provides a practical and useful
alternative method to study these problems. Most
importantly, it is now possible to perform
calculations sophisticated enough to have a real
impact on AM science. These include high precision
computations of the ground and excited states of
small molecules; scattering of electrons from
atoms, atomic ions, and small polyatomic molecules;
simple chemical reactions involving atom-diatom
collisions; photoionization and photodissociation;
and various time dependent processes such as
multiphoton ionization and the interaction of atoms
with ultra strong (or short) electromagnetic fields.

Gravitational Physics 

The computational goal of classical and
astrophysical relativity is the solution of the
associated nonlinear partial differential equations.
For example, simulations have been performed of:
the critical behavior in black hole formation, using
high accuracy adaptive grid methods that follow the
collapse of spherical scalar wave pulses over at
least fourteen orders of magnitude; the structure we
currently see in the universe (galaxies, clusters of
galaxies arranged in sheets, voids, etc.) and how it
may have arisen through fluctuations generated
during an inflationary epoch; head-on collisions of
black holes; and horizon behavior in a number of
black hole configurations.

CURRENT STATE 

Definitive calculations in all of the areas men- 
tioned above would require significantly greater 
computing resources than have been available up 
to now. Nevertheless, steady progress has been
made during the last decade due to important
improvements in algorithms and calculational
techniques, and very rapid increases in available
computing power. For example, among the major
achievements in lattice gauge theory have been: a
demonstration that quarks and gluons are confined
at low temperatures; steady improvements in
spectrum calculations, which have accelerated
markedly in the last year; an estimate of the
transition temperature between the ordinary state of
matter and the quark-gluon plasma; a bound on the
mass of the Higgs boson; calculations of weak
decay parameters including a determination of the
mixing parameter; and a determination of the
strong coupling constant at the energy scale of 5
GeV from a study of the charmonium spectrum.
Much of this work has been carried out at the NSF 
Supercomputer Centers. In the area of atomic and 
molecular physics, until quite recently most AM 
theorists were required to compute using 
simplified models or, if the research was computa- 
tionally intensive, to use vector supercomputers. 
This has changed with the widespread availability
of cheap, fast, Unix-based RISC workstations.
These "boxes" are now capable of performing at
the 40-50 megaflop level and can have as much as
256 megabytes of memory. In addition, it is
possible to cluster these workstations and distribute
the computational task among the CPUs. There are
a few researchers in the US who have as many as
twenty or thirty of these workstations for their
own group. This has enabled computational
experiments using a loosely coupled, parallel model
on selected problems in AM physics. However, this is
not the norm in AM theory. More typically, our
most computationally intensive calculations are
still performed on the large mainframe vector
supercomputers available only to a limited number
of users. The majority of the AM researchers in
the country are still computing on single
workstations or (outdated) mainframes of one kind or
another.

FUTURE COMPUTING NEEDS 

In the field of high energy physics, the processing 
of raw data and the simulation of generic mixing 
processes are activities that are well-suited to a 
centralized computer center. The processing of the 
raw data should be done in a consistent,
organized, and reliable manner. Often, in the middle
of the data processing, special features in some
data are discovered, and these need to be treated
quickly and in a manner consistent with the entire
sample. It is usual and logical that the processing
take place at the central accelerator center where
the apparatus sits, because the complete records of
the data taking usually reside there. In contrast,
the high energy physics work in simulation of
specific processes and analysis of compressed data
is well matched to individual university
groups. In terms of computing needs, all of the
above are well served either by a large, powerful,
central computer system, or by a cluster or farm of
workstations. For example, CERN does the
majority of its computing on central computers, 
while the Wilson Lab has a farm of DECstation 
5000/240's to process its raw data. High energy 
computing usually proceeds one event at a time; 
each event, whether from simulation or raw data, 
can be dealt to an individual workstation for 
processing. There is no need to put the entire 
resources of a supercomputer on one event.
However, massively parallel computers have the
potential to handle large numbers of events
simultaneously in an efficient manner. These
machines will also be of importance in the work
on lattice gauge theory. Conversely, university
groups are well served by the powerful
workstations now available, which have freed them
from dependence on the central laboratory to study
the physics signals of interest to them. These groups
need both fast CPUs, for simulation of data, and
relatively large and fast disk farms, for
repeated processing of the compressed data. In the
field of atomic and molecular modeling, the future
lies in the use of massively parallel, scalable
multicomputers. AM theorists, with rare exceptions,
have not been as active as other disciplines in 
moving to these platforms. This is to be contrasted 
with the quantum chemists, the lattice QCD 
theorists and many materials scientists who are be- 
coming active users of these computers. The lack 
of portability of typical AM codes and the need to 
expend lots of time and effort in rewriting or 
rethinking algorithms has prevented a mass migra- 
tion to these platforms. 
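
The event-at-a-time processing described above for high energy physics, in which each event is dealt to an individual workstation, can be sketched as follows. This is a minimal illustration and not any laboratory's actual software: the event layout, `process_event`, and `process_farm` are hypothetical, and a pool of threads stands in for the farm of workstations.

```python
from concurrent.futures import ThreadPoolExecutor


def process_event(event):
    """Hypothetical per-event work: sum the energies recorded in one event."""
    return {"id": event["id"], "total_energy": sum(event["hits"])}


def process_farm(events, n_workers=4):
    """Deal independent events to a pool of workers.

    Events share no state, so each can be handled by whichever worker
    is free; pool.map returns results in the original event order.
    """
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(process_event, events))


# Eight made-up events, each with three made-up hit energies.
events = [{"id": i, "hits": [float(i), 1.0, 2.0]} for i in range(8)]
results = process_farm(events)
print(results[3])
```

Because no event depends on another, the same structure applies whether the workers are threads, separate workstations, or the nodes of a massively parallel machine; only the mechanism for handing out events changes.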



Accomplishments in Computer Science and Engineering since the Lax Report



by Mary K. Vernon 



While vector multiprocessors have been the 
workhorses for many fields of computational 
science and engineering over the past ten years, re- 
search in computer science and engineering has 
been focused both on improving the capabilities of 
these systems, and on developing the next genera- 
tion of high-performance computing systems — 
namely the scalable, highly parallel computers 
which have recently been commercially realized 
as systems such as the Intel Paragon, the Kendall 
Square Research KSR-1, and the Thinking
Machines Corporation CM-5. 

A variety of factors make scalable, highly parallel 
computers the only viable way to achieve the 
teraflop capability required by Grand Challenge 
applications. These systems represent far more 
than an evolutionary step from their modestly 
parallel vector predecessors. Realizing the teraflop 
potential of massively parallel systems requires ad- 
vances in a broad range of computer science and 
engineering subareas, including VLSI, computer 
architecture, operating systems, runtime systems, 
compilers, programming languages, and algo- 
rithms. The development of the new capabilities in 
turn requires computationally intensive experimen- 
tation and/or simulations that have been carried 
out on experimental prototypes (e.g., the NYU 
Ultracomputer), early commercial parallel 
machines (such as the BBN Butterfly or the Intel 
iPSC/2), and more recently on high-performance 
workstations as well as the emerging massively 
parallel systems such as the Thinking Machines 
CM-5 and the Intel Paragon.

Computer science and engineering researchers 
have made tremendous progress in the past ten 
years in the development of high performance 
computing technology, including the development 
of ALL of the major technologies in massively 
parallel systems. Among the specific accomplish- 
ments are: 

♦ development of RISC processor technology
and the compiler technology for RISC
processors, which are used in all high performance
workstations as well as in the massively
parallel machines.

♦ development of computer-aided tools to 
facilitate the design, testing, and fabrication 
of complex digital systems, and their con- 
stituent components. 

♦ invention of the multicomputer and develop- 
ment of the message-passing programming 
paradigm which is used in many of today's 
massively parallel systems. 

♦ refinement of shared memory architectures 
and the shared memory programming model 
which is used in the KSR-1, the Cray T3D, 
and other emerging massively parallel 
machines. 

♦ invention of the hypercube interconnection 
network and refinement of this network to 
lower-dimensional, 2-d and 3-d, mesh net- 
works that are currently used in the Intel 
Paragon and the Cray T3D.

♦ invention of the fat-tree interconnection net- 
work which is currently used in the Thinking 
Machines CM-5. 

♦ refinement of the SIMD architecture which is 
used for example in the Thinking Machines 
CM-2 and the MasPar MP-1.

♦ invention and refinement of the SPMD and 
data parallel programming models which are 
supported in several massively parallel sys- 
tems. 

♦ development of the technology underlying 
the mature compilers for vector machines 
(i.e., compilers that give delivered perfor- 
mance that is a substantial fraction of the 
theoretical peak performance of these 
machines).

♦ development of the technology underlying all
of the existing compilers for parallel
machines.

♦ development of the Mach operating system, 
which provided the basis for the OSF/1 stand- 
ard used, for example, in the Intel Paragon. 

♦ development of light-weight and wait-free
synchronization primitives 

♦ development of performance debugging tools 

♦ development of high-performance database 
technologies, including both algorithms and 
architectures that have influenced emerging 
systems, for example from NCR, Teradata, 
and IBM. 

♦ development of parallel algorithms for high- 
performance optimization 

♦ development of parallel algorithms for 
numerical linear algebra 

♦ development of machine learning technology 
for computational biology 

In other words, key hardware, system software, 
and algorithm technologies are directly the result 
of computer science and engineering research 
across a broad range of subdisciplines. Much of
this work has been highly experimental, and has
made extensive use of current-generation, early
commercial, and prototype high performance
systems. For example, simulations of systems such
as next-generation architectures and multi-user
database systems, as well as the development and
testing of new algorithms for areas such as
large-scale optimization, numerical linear algebra,
and computational biology, often require days of
simulation time on the most advanced platforms
available.

Research efforts today are focused on improving 
the capabilities, performance, and ease of use of 
parallel machine technology, including the 
capabilities of workstation networks. Experiments 
to evaluate the technology for next-generation sys- 
tems, like many other applications that would be 
classified as "computational engineering", require 
the highest performance systems available. In addi- 
tion, simulation and/or testing of innovations in 
computer architecture or operating systems some- 
times involve modifications to the host hardware 
and/or operating software. These modifications 
can be developed and debugged on medium-scale 
versions of the high-end systems. Support for such 
initial development, as well as porting working 
modifications to larger-scale systems for further 
test, is critical to the rapid development of new 
HPC technologies. 